Data entry – the process of converting verbal or written responses to electronic form328.
Data fusion — the process of integrating multiple data sources to produce more consistent, accurate, and useful information than that provided by any individual data source329.
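As a minimal illustration (not drawn from the cited source), the sketch below fuses two noisy sensor readings of the same quantity by inverse-variance weighting, a standard way to obtain an estimate more accurate than either source alone; the sensor values and variances are invented for the example.

```python
# Minimal sketch: fuse two noisy estimates of the same quantity by
# inverse-variance weighting. Values and variances are illustrative.

def fuse(estimates, variances):
    """Return the inverse-variance weighted mean and its variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fused = sum(w * e for w, e in zip(weights, estimates)) / total
    return fused, 1.0 / total  # fused variance is smaller than any input

# Two sensors measure the same temperature with different noise levels.
value, var = fuse([20.4, 21.1], [0.25, 0.64])
print(f"fused estimate: {value:.2f} (variance {var:.3f})")
```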
Data Integration involves combining data residing in different sources and supplying it to users in a unified view. Data integration is in high demand in both commercial and scientific domains that need to merge data and research results from different repositories330.
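A common concrete form of this is joining records from separate repositories on a shared key. The hedged sketch below uses pandas (an assumed tool choice, not mandated by the definition) to merge two hypothetical sources into one unified view.

```python
import pandas as pd

# Two hypothetical data sources holding different attributes of the same entities.
customers = pd.DataFrame({"id": [1, 2, 3],
                          "name": ["Ada", "Boris", "Chen"]})
orders = pd.DataFrame({"id": [1, 1, 3],
                       "amount": [250, 80, 120]})

# Integrate them into a single unified view keyed on "id".
unified = customers.merge(orders, on="id", how="left")
print(unified)
```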
Data is a collection of qualitative and quantitative values. It contains information, often represented numerically, that needs to be analyzed.
Data Lake is a type of data repository that stores data in its natural (raw) format and relies on various schemata and structures to index the data331.
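One way to picture the "natural format" idea is schema-on-read: records are stored raw, and a schema is applied only when the data is consumed. The sketch below is illustrative, with made-up records, and uses only the Python standard library.

```python
import json, io

# Store raw, heterogeneous records as-is (no schema is enforced on write).
lake = io.StringIO()
for record in [{"user": "ada", "clicks": 3},
               {"user": "boris", "clicks": "7", "ref": "ad"}]:
    lake.write(json.dumps(record) + "\n")

# Apply a schema only at read time, when a specific question is asked.
lake.seek(0)
total_clicks = sum(int(json.loads(line)["clicks"]) for line in lake)
print(total_clicks)  # 10
```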
Data markup is the stage of processing structured and unstructured data during which data (including text documents, photo and video images) are assigned identifiers that reflect the type of data (data classification), and/or are interpreted to solve a specific problem, including by means of machine learning methods (National Strategy for the Development of Artificial Intelligence for the Period up to 2030)332.
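As a toy illustration of the classification step (not part of the cited strategy), the sketch below assigns a type identifier to each raw item with a simple hand-written rule; real pipelines would typically use trained models instead, and the item names and labels here are invented.

```python
# Toy markup step: tag each raw item with a type identifier.
# Rules, file names, and labels are invented for illustration.

def classify(item: str) -> str:
    if item.endswith((".jpg", ".png")):
        return "photo"
    if item.endswith((".mp4", ".avi")):
        return "video"
    return "text_document"

raw_items = ["report.txt", "scan.jpg", "interview.mp4"]
marked_up = [{"item": name, "type": classify(name)} for name in raw_items]
print(marked_up)
```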
Data Mining is the process of analyzing large datasets and extracting information from them with machine learning, statistical approaches, and many other methods333.
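As one hedged example among many possible ones, the sketch below mines cluster structure from a small synthetic dataset with scikit-learn's k-means implementation; the data and parameter choices are illustrative only.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic dataset: two obvious groups of 2-D points.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(50, 2)),
               rng.normal(3.0, 0.3, size=(50, 2))])

# Extract the latent grouping with k-means, a classic mining technique.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels[:5], labels[-5:])  # points from the two groups get different labels
```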
Data parallelism is a way of scaling training or inference that replicates an entire model onto multiple devices and then passes a subset of the input data to each device. Data parallelism can enable training and inference on very large batch sizes; however, data parallelism requires that the model be small enough to fit on all devices. See also model parallelism334.
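The sketch below mimics the idea with plain NumPy on simulated "devices": the full weight matrix is copied to every replica, the batch is split into shards, and each replica runs the same forward pass on its own shard. This is a conceptual sketch only; real systems apply the same pattern across physical accelerators.

```python
import numpy as np

n_devices = 4
rng = np.random.default_rng(0)

weights = rng.normal(size=(8, 3))          # the "model": one linear layer
batch = rng.normal(size=(32, 8))           # a large input batch

# Data parallelism: replicate the entire model onto every device...
replicas = [weights.copy() for _ in range(n_devices)]
# ...and give each device a different shard of the input data.
shards = np.split(batch, n_devices)

# Each device runs the same forward pass on its own shard.
partial_outputs = [shard @ w for shard, w in zip(shards, replicas)]
outputs = np.concatenate(partial_outputs)  # identical to batch @ weights

assert np.allclose(outputs, batch @ weights)
```

Note that every replica must hold the whole weight matrix, which is exactly the constraint the definition mentions: the model must be small enough to fit on each device.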
Data Processing Unit (DPU) is a programmable, specialized electronic circuit with hardware-accelerated data processing for data-oriented computing335.
Data protection is the process of protecting data; it involves the relationship between the collection and dissemination of data and technology, the public perception and expectation of privacy, and the political and legal underpinnings surrounding that data. It aims to strike a balance between individual privacy rights and the use of data for business purposes336.
Data Refinement is used to convert an abstract data model (for example, one expressed in terms of sets) into implementable data structures such as arrays337.
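A textbook instance is refining an abstract "set of integers" specification into a concrete sorted array whose operations preserve the abstract behaviour. The sketch below is illustrative, using Python's bisect module on a sorted list standing in for the array.

```python
from bisect import bisect_left, insort

class IntSet:
    """Abstract model: a set of integers.
    Concrete refinement: a sorted array with binary search."""

    def __init__(self):
        self._items = []  # sorted array representing the set

    def add(self, x: int) -> None:
        i = bisect_left(self._items, x)
        if i == len(self._items) or self._items[i] != x:
            insort(self._items, x)  # keep the array sorted, no duplicates

    def __contains__(self, x: int) -> bool:
        i = bisect_left(self._items, x)
        return i < len(self._items) and self._items[i] == x

s = IntSet()
for v in (5, 3, 5, 9):
    s.add(v)
print(3 in s, 4 in s)  # True False
```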
Data Science is a broad grouping of mathematics, statistics, probability, computing, and data visualization aimed at extracting knowledge from heterogeneous data (images, sound, text, genomic data, social network links, physical measurements, etc.). The methods and tools derived from artificial intelligence are part of this family338,339.
Data set is a set of data that has undergone preliminary preparation (processing) in accordance with the requirements of the legislation of the Russian Federation on information, information technology, and information protection, and that is necessary for the development of software based on artificial intelligence (National Strategy for the Development of Artificial Intelligence for the Period up to 2030)340.
Data Streaming Accelerator (DSA) is a device that performs a specific task, in this case transferring data in less time than the CPU would take. What makes DSA special is that it is designed around a capability that Compute Express Link (CXL) adds on top of PCI Express 5.0: coherent access to RAM for all peripherals connected to a PCI Express port, i.e., they all use the same memory addresses.