Using Python

Best Practice: Use a Python Virtual Environment

To avoid dependency conflicts and keep your environment clean, create and activate a Python virtual environment before installing any packages:

python3 -m venv venv
source venv/bin/activate
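
To confirm the environment is active, check that Python now resolves inside the venv (when activated, sys.prefix points into the venv directory):

# When the virtual environment is active, sys.prefix points inside venv/.
import sys

print(sys.prefix)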

Install Dependencies

pip install llama-cpp-python pymilvus "pymilvus[model]"

Install Alith

python3 -m pip install alith -U
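
To verify the install, a quick import check using the standard importlib.metadata module (this only confirms the package is importable and reports the installed version):

# Sanity check: alith imports cleanly and pip knows its version.
import importlib.metadata

import alith  # raises ImportError if the install failed

print(importlib.metadata.version("alith"))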

Set Environment Variables

For OpenAI/ChatGPT API:

export PRIVATE_KEY=<your wallet private key>
export OPENAI_API_KEY=<your openai api key>

For other OpenAI-compatible APIs (DeepSeek, Gemini, etc.):

export PRIVATE_KEY=<your wallet private key>
export LLM_API_KEY=<your api key>
export LLM_BASE_URL=<your api base url>
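
Alith picks these variables up from the environment. A minimal fail-fast check you can run before starting the server (illustrative only, not part of Alith) covers both API styles:

# Illustrative: verify the variables for your chosen API style are set.
import os

required = ["PRIVATE_KEY"]
if not os.environ.get("OPENAI_API_KEY"):
    # No OpenAI key, so the OpenAI-compatible settings must be present.
    required += ["LLM_API_KEY", "LLM_BASE_URL"]

missing = [v for v in required if not os.environ.get(v)]
if missing:
    raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")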

Step 1: Run the Inference Server

Note: The wallet address derived from the private key you provide to the inference server is the LAZAI_IDAO_ADDRESS. Once the inference server is running, its URL must be registered with the add_inference_node function in Alith; only LazAI admins can do this. You can derive the address as shown below.
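
A minimal sketch for deriving that address, assuming the third-party eth_account library (pip install eth-account); this is not part of Alith:

# Derive the wallet (public) address that serves as LAZAI_IDAO_ADDRESS.
import os

from eth_account import Account

account = Account.from_key(os.environ["PRIVATE_KEY"])
print("LAZAI_IDAO_ADDRESS:", account.address)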

Local Development

For OpenAI/ChatGPT API:

from alith.inference import run
 
"""Run the server and use the following command to test the server
 
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-LazAI-User: 0xc3e98E8A9aACFc9ff7578C2F3BA48CA4477Ecf49" \
-H "X-LazAI-Nonce: 123456" \
-H "X-LazAI-Signature: HSDGYUSDOWP123" \
-H "X-LazAI-Token-ID: 1" \
-d '{
  "model": "gpt-3.5-turbo",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "What is the capital of France?"}
  ],
  "temperature": 0.7,
  "max_tokens": 100
}'
"""
server = run(model="gpt-3.5-turbo", settlement=True, engine_type="openai")
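
If you prefer to test from Python instead of curl, the equivalent request with the requests library (the header values are the same placeholders used in the curl example above):

# Python equivalent of the curl test above.
# requests sets Content-Type: application/json automatically when json= is used.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    headers={
        "X-LazAI-User": "0xc3e98E8A9aACFc9ff7578C2F3BA48CA4477Ecf49",
        "X-LazAI-Nonce": "123456",
        "X-LazAI-Signature": "HSDGYUSDOWP123",
        "X-LazAI-Token-ID": "1",
    },
    json={
        "model": "gpt-3.5-turbo",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant"},
            {"role": "user", "content": "What is the capital of France?"},
        ],
        "temperature": 0.7,
        "max_tokens": 100,
    },
)
print(resp.status_code, resp.json())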

For other OpenAI-compatible APIs (DeepSeek, Gemini, etc.):

from alith.inference import run
 
# Example: Using DeepSeek model from OpenRouter
server = run(settlement=True, engine_type="openai", model="deepseek/deepseek-r1-0528")
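
A sketch of the same launch when targeting OpenRouter: the base URL below is OpenRouter's public API endpoint, and the key placeholder is yours to fill in (exporting these in the shell, as in the setup section, works equally well):

# Illustrative: configure the OpenAI-compatible endpoint before starting the server.
import os

os.environ.setdefault("LLM_BASE_URL", "https://openrouter.ai/api/v1")  # OpenRouter API base
os.environ.setdefault("LLM_API_KEY", "<your openrouter api key>")  # placeholder

from alith.inference import run

server = run(settlement=True, engine_type="openai", model="deepseek/deepseek-r1-0528")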

Production Deployment on Phala TEE Cloud

For production-ready applications, deploy your inference server on Phala TEE Cloud for enhanced security and privacy. Once deployed, you will receive an inference URL, which a LazAI admin must register using the add_inference_node function.

Alternatively, you can use an existing registered inference node instead of running your own.

Step 2: Request Inference via LazAI Client

from alith import Agent, LazAIClient
 
# 1. Join the iDAO: register your wallet on LazAI and deposit fees (one-time setup)
LAZAI_IDAO_ADDRESS = "0xc3e98E8A9aACFc9ff7578C2F3BA48CA4477Ecf49" # Replace with your own address
client = LazAIClient()

try:
    client.get_user(client.wallet.address)
    print("User already exists")
except Exception:
    print("User does not exist, adding user")
    client.add_user(10000000)  # Register the wallet with an initial balance
    client.deposit_inference(LAZAI_IDAO_ADDRESS, 1000000)  # Deposit fees for this inference node
# 2. Request the inference server with the settlement headers and DAT file id
file_id = 11  # Use the File ID you received from the Data Contribution step
url = client.get_inference_node(LAZAI_IDAO_ADDRESS)[1]
print("url", url)
agent = Agent(
    # Note: replace with your model here
    model="gpt-3.5-turbo",
    base_url=f"{url}/v1",
    # Extra headers for settlement and DAT file anchoring
    extra_headers=client.get_request_headers(LAZAI_IDAO_ADDRESS, file_id=file_id),
)
print(agent.prompt("summarize it"))
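
To inspect the settlement headers before sending a request, you can print what get_request_headers produces (a sketch assuming it returns a plain mapping of header names to values; the exact set may vary by Alith version):

# Inspect the headers that settle the request and anchor it to your DAT file.
headers = client.get_request_headers(LAZAI_IDAO_ADDRESS, file_id=file_id)
for name, value in headers.items():
    print(f"{name}: {value}")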

Security & Privacy

  • Your data never leaves your control. Inference is performed in a privacy-preserving environment, using cryptographic settlement and secure computation.

  • Settlement headers ensure only authorized users and nodes can access your data for inference.

  • File ID links your inference request to the specific data you contributed, maintaining a verifiable chain of custody.
