Benchmarking is a set of techniques that allow you to study the experience of competitors and implement best practices in your company151.
BETA refers to a phase in online service development in which the service is coming together functionality-wise but genuine user experiences are required before the service can be finished in a user-centered way. In online service development, the aim of the beta phase is to recognize both programming issues and usability-enhancing procedures. The beta phase is particularly often used in connection with online services and it can be either freely available (open beta) or restricted to a specific target group (closed beta)152.
Bias is a systematic trend that causes differences between results and facts. Error exists in the numbers of the data analysis process, including the source of the data, the estimate chosen, and how the data is analyzed. Error can seriously affect the results, for example, when studying people’s shopping habits. If the sample size is not large enough, the results may not reflect the buying habits of all people. That is, there may be discrepancies between survey results and actual results153.
Biased algorithm – systematic and repetitive errors in a computer system that lead to unfair results, such as one privilege persecuting groups of users over others. Also, sexist and racist algorithms154,155.
Bidirectional (BiDi) is a term used to describe a system that evaluates the text that both precedes and follows a target section of text. In contrast, a unidirectional system only evaluates the text that precedes a target section of text156.
Bidirectional Encoder Representations from Transformers (BERT) is a model architecture for text representation. A trained BERT model can act as part of a larger model for text classification or other ML tasks. BERT has the following characteristics: Uses the Transformer architecture, and therefore relies on self-attention. Uses the encoder part of the Transformer. The encoder’s job is to produce good text representations, rather than to perform a specific task like classification. Is bidirectional. Uses masking for unsupervised training157,158.
Bidirectional language model is a language model that determines the probability that a given token is present at a given location in an excerpt of text based on the preceding and following text159.
Big data is a term for sets of digital data whose large size, rate of increase or complexity requires significant computing power for processing and special software tools for analysis and presentation in the form of human-perceptible results160.
Big O notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity. It is a member of a family of notations invented by Paul Bachmann, Edmund Landau, and others, collectively called Bachmann—Landau notation or asymptotic notation161.
Bigram – an N-grams in which N=2162.
Binary choice regression model is a regression model in which the dependent variable is dichotomous or binary. Dependent variable can take only two values and mean, for example, belonging to a particular group163.
Binary classification is a type of classification task that outputs one of two mutually exclusive classes. For example, a machine learning model that evaluates email messages and outputs either «spam» or «not spam» is a binary classifier164.
Binary format is any file format in which information is encoded in some format other than a standard character-encoding scheme. A file written in binary format contains information that is not displayable as characters. Software capable of understanding the particular binary format method of encoding information must be used to interpret the information in a binary-formatted file. Binary formats are often used to store more information in less space than possible in a character format file. They can also be searched and analyzed more quickly by appropriate software. A file written in binary format could store the number «7» as a binary number (instead of as a character) in as little as 3 bits (i.e., 111), but would more typically use 4 bits (i.e., 0111). Binary formats are not normally portable, however. Software program files are written in binary format. Examples of numeric data files distributed in binary format include the IBM-binary versions of the Center for Research in Security Prices files and the U.S. Department of Commerce’s National Trade Data Bank on CD-ROM. The International Monetary Fund distributes International Financial Statistics in a mixed-character format and binary (packed-decimal) format. SAS and SPSS store their system files in binary format