POV Data Structure

POV is LazAI's protocol and standard for data unification. A basic POV record includes the following fields:

message POV {
  bytes data_hash = 1;
  string data_tag = 2;   // LLM weight, dataset metadata, user info, inference prompt
  string data_type = 3;  // f32, f16, …
  bytes data_size = 4;   // data tensor size
  Tensor data = 5;
  uint64 timestamp = 6;
  Proof proof = 7;       // optional data proof
}
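
To make the field layout concrete, here is a minimal Python sketch that mirrors the message above as a dataclass and fills it from a flat list of floats. It is an illustration, not LazAI's implementation: the helper name `make_pov`, the use of SHA-256 for `data_hash`, the little-endian f32 packing, and the reading of `data_size` as a tensor shape are all assumptions, since the definition above does not specify them.

```python
import hashlib
import struct
import time
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class POV:
    """Python mirror of the POV message fields (illustrative only)."""
    data_hash: bytes             # assumption: SHA-256 of the packed tensor bytes
    data_tag: str                # e.g. "dataset metadata", "inference prompt"
    data_type: str               # e.g. "f32", "f16"
    data_size: Tuple[int, ...]   # assumption: tensor shape
    data: bytes                  # packed tensor payload
    timestamp: int               # seconds since the Unix epoch
    proof: Optional[bytes] = None  # optional data proof


def make_pov(values: Sequence[float], data_tag: str,
             timestamp: Optional[int] = None) -> POV:
    """Pack a 1-D list of floats as little-endian f32 and wrap it in a POV."""
    payload = struct.pack(f"<{len(values)}f", *values)
    return POV(
        data_hash=hashlib.sha256(payload).digest(),
        data_tag=data_tag,
        data_type="f32",
        data_size=(len(values),),
        data=payload,
        timestamp=int(time.time()) if timestamp is None else timestamp,
    )
```

Calling `make_pov([0.5, 1.5], "inference prompt")` yields a record whose `data` field holds 8 bytes (two f32 values) and whose `data_hash` can be recomputed from `data` by anyone holding the record.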

Different users have different data sources, modalities (typically text, speech, images, and video), and formats, so all of them need to be unified into a fixed POV format. POV encodes and decodes the data as tensors and records information such as the data's hash, class, and timestamp. To upload data on-chain, users submit it in the compressed POV format.
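
The text does not say which serialization or compression scheme the compressed POV format uses. As a rough sketch of the idea only, the following assumes a JSON encoding of the record followed by zlib compression; both choices, and the function names `compress_pov` and `decompress_pov`, are hypothetical stand-ins.

```python
import json
import zlib


def compress_pov(record: dict) -> bytes:
    """Serialize a POV-like record deterministically and compress it.

    Assumption: JSON + zlib stand in for the real wire format, which the
    documentation does not specify.
    """
    encoded = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return zlib.compress(encoded, level=9)


def decompress_pov(blob: bytes) -> dict:
    """Inverse of compress_pov."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

A record with repetitive tensor content, such as a mostly zero tensor, compresses well under this scheme, and `decompress_pov(compress_pov(record))` returns the original record.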

Privacy Data & Public Data

At LazAI, we encourage open data formats and content, so POV supports storing data publicly on-chain, where anyone can access it. For private data, LazAI also supports off-chain storage: it can efficiently verify the integrity and consistency of off-chain data without storing all of the data on-chain. This gives off-chain data stronger privacy protection and trust guarantees, avoids the risks of centralized data sources, and promotes decentralized data governance. An iDAO functions much like a traditional data center, except that the data does not need to be stored directly on the blockchain; users only need to submit a proof to the blockchain.

LazAI's on-chain support is aimed mainly at private data. In theory, private data of any size can be supported, because only its credentials are calculated and recorded on-chain. Public data is limited to 32K, in line with the chain's storage requirements and the context window of general AI models. With the Alith Agent toolkit, users can preprocess data and compute proofs locally and then upload the data to the LazAI network, making the data available but invisible.
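
The split between public and private data can be sketched as follows. This is a simplified model under stated assumptions: "32K" is read as 32 KiB of raw bytes, a plain SHA-256 digest stands in for the on-chain credential, and the function names are hypothetical. It does not use the Alith toolkit or any LazAI contract interface.

```python
import hashlib
import hmac

PUBLIC_LIMIT_BYTES = 32 * 1024  # assumption: "32K" interpreted as 32 KiB


def on_chain_payload(data: bytes, public: bool) -> dict:
    """Decide what would be recorded on-chain for a piece of data."""
    if public:
        if len(data) > PUBLIC_LIMIT_BYTES:
            raise ValueError("public data exceeds the 32K on-chain limit")
        return {"kind": "public", "data": data}
    # Private data stays off-chain; only a fixed-size commitment is recorded,
    # so the size of the original data does not matter.
    return {"kind": "private", "commitment": hashlib.sha256(data).digest()}


def verify_off_chain(data: bytes, payload: dict) -> bool:
    """Check off-chain data against the commitment recorded on-chain."""
    digest = hashlib.sha256(data).digest()
    return hmac.compare_digest(digest, payload["commitment"])
```

With this model, a 1 MiB private file produces the same 32-byte on-chain footprint as a 1 KiB one, and any change to the off-chain copy makes `verify_off_chain` return `False`.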

Privacy Data Workflow

In LazAI, we use OpenPGP to encrypt and decrypt private data. The OpenPGP encryption process is as follows: randomly generate a key and use it to encrypt the data with a symmetric encryption algorithm; then use an asymmetric encryption algorithm (RSA) to encrypt that key with the recipient's public key. The encrypted data and the encrypted key together form the encrypted message.
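
The two-layer structure described above (a random symmetric key for the data, RSA for the key) can be illustrated with the standard library alone. The primitives below are toy stand-ins chosen so the sketch runs without dependencies: a SHA-256 counter keystream replaces the symmetric cipher, and textbook RSA over two well-known Mersenne primes replaces a real key pair. Neither is secure, and this is not OpenPGP; production code should use an OpenPGP implementation.

```python
import hashlib
import secrets

# Toy RSA key pair from two known Mersenne primes. NOT secure: the factors
# are public knowledge and no padding is used. For illustration only.
P, Q = 2**127 - 1, 2**521 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256 counter keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(a ^ b for a, b in zip(data[offset:offset + 32], block))
    return bytes(out)


def encrypt_for_recipient(plaintext: bytes, public_key=(N, E)):
    """Encrypt data with a fresh random key, then wrap that key with RSA."""
    session_key = secrets.token_bytes(32)
    ciphertext = keystream_xor(session_key, plaintext)
    n, e = public_key
    encrypted_key = pow(int.from_bytes(session_key, "big"), e, n)
    return ciphertext, encrypted_key


def decrypt_as_recipient(ciphertext: bytes, encrypted_key: int, private_key=(N, D)):
    """Unwrap the session key with the private key, then decrypt the data."""
    n, d = private_key
    session_key = pow(encrypted_key, d, n).to_bytes(32, "big")
    return keystream_xor(session_key, ciphertext)
```

The point of the design is that the bulk data is only ever processed by the fast symmetric step, while the slow asymmetric step touches just the 32-byte key.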

Specifically, we obtain a random key from the user's Web3 wallet, such as MetaMask, and use it to encrypt the user's private data; this step also verifies the sender's identity. We then use the asymmetric encryption algorithm (RSA) to encrypt that key with the recipient's public key, which can be obtained from LazAI's data registration contract. Next, we upload the encrypted data to DA (such as IPFS, Google Drive, or Dropbox) and register the encrypted data URL and the encrypted key in the LazAI contract. The test data verifier can decrypt the encrypted key with its private key, download the encrypted data through the URL, and decrypt it. Decryption runs inside a TEE so that the data stays secure and cannot be tampered with; the TEE then generates a proof for the decrypted data and uploads it to the LazAI contract for verification.
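
The register-then-verify flow in this paragraph can be modeled with in-memory stand-ins. Everything named here is hypothetical: `RegistryStub` is not the LazAI contract interface, the `storage` dictionary stands in for DA, the `unwrap_key` and `decrypt` callables stand in for the RSA and symmetric steps, and a SHA-256 digest of the decrypted data stands in for the TEE-generated proof, whose real format the text does not describe.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class RegistryStub:
    """In-memory stand-in for the data registration contract (hypothetical)."""
    files: Dict[int, Tuple[str, bytes]] = field(default_factory=dict)
    proofs: Dict[int, str] = field(default_factory=dict)

    def register(self, url: str, encrypted_key: bytes) -> int:
        """Record the encrypted data URL and encrypted key; return a file id."""
        file_id = len(self.files) + 1
        self.files[file_id] = (url, encrypted_key)
        return file_id

    def submit_proof(self, file_id: int, proof: str) -> None:
        """Attach a verifier's proof to a registered file."""
        if file_id not in self.files:
            raise KeyError(f"unknown file id {file_id}")
        self.proofs[file_id] = proof


def verify_in_tee(registry: RegistryStub, file_id: int,
                  storage: Dict[str, bytes],
                  unwrap_key: Callable[[bytes], bytes],
                  decrypt: Callable[[bytes, bytes], bytes]) -> str:
    """Verifier side: fetch, unwrap the key, decrypt, then submit a proof.

    In the described design this function would run inside a TEE.
    """
    url, encrypted_key = registry.files[file_id]
    key = unwrap_key(encrypted_key)          # private-key step
    plaintext = decrypt(key, storage[url])   # symmetric step on downloaded data
    proof = hashlib.sha256(plaintext).hexdigest()  # stand-in for a TEE proof
    registry.submit_proof(file_id, proof)
    return proof
```

Note that the registry only ever holds the URL, the encrypted key, and the proof; the plaintext exists solely inside `verify_in_tee`.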

How to expand the Data Structure

User data is mainly used for model training and on-chain inference, so the only requirement is that it meets the input requirements of the model. LazAI therefore provides the Alith Agent framework and the Model Context Protocol (MCP) to handle preprocessing for multimodal data and various data formats, as well as conversion to POV tensor data. The advantage of this approach is that as many data types as possible can be supported at the application layer without modifying the blockchain network or its protocols.
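
One way to picture application-layer extensibility is a registry of per-modality preprocessors that all emit the same tensor encoding. The sketch below does not use the Alith or MCP APIs; the registry, the modality names, the scaling rules, and the f32 output format are all assumptions made for illustration.

```python
import struct
from typing import Callable, Dict, List

Preprocessor = Callable[[bytes], List[float]]
PREPROCESSORS: Dict[str, Preprocessor] = {}


def register(modality: str):
    """Decorator that adds a preprocessor for a new modality or format."""
    def wrap(fn: Preprocessor) -> Preprocessor:
        PREPROCESSORS[modality] = fn
        return fn
    return wrap


@register("text")
def text_to_values(raw: bytes) -> List[float]:
    # One value per UTF-8 byte, scaled to [0, 1].
    return [b / 255.0 for b in raw]


@register("audio/pcm16")
def pcm16_to_values(raw: bytes) -> List[float]:
    # Little-endian signed 16-bit samples, scaled to [-1, 1).
    samples = struct.unpack(f"<{len(raw) // 2}h", raw[: len(raw) // 2 * 2])
    return [s / 32768.0 for s in samples]


def to_pov_tensor(modality: str, raw: bytes) -> bytes:
    """Convert raw input of a registered modality to packed f32 tensor bytes."""
    try:
        fn = PREPROCESSORS[modality]
    except KeyError:
        raise ValueError(f"no preprocessor registered for {modality!r}") from None
    values = fn(raw)
    return struct.pack(f"<{len(values)}f", *values)
```

Supporting a new format then means registering one more function; the tensor encoding that reaches the chain, and therefore the protocol, stays unchanged.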
