Local AI Context

My testing and googling suggest the answer is no, but I wanted to confirm it with the hive mind: can a local LLM with vision capabilities run the OCR demo? I have tested in Ollama and LM Studio (llama.cpp) with recent Gemma models, which run successfully in both inference engines’ chat interfaces, but it does not appear that I can send files through TTMSFNCCloudAI. Can someone please confirm whether this is a limitation of the inference engine, of AI Studio, or indeed of my own ineptitude :slight_smile: .

Just retried here: I could get OCR to work with the included demo, which hardwires the llama3.2-vision model, on Ollama 0.21.0.
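(If that model is not already present locally, it first needs to be pulled, e.g. with ollama pull llama3.2-vision.)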

Thanks Bruno. By hardwiring another vision model, per the following…
TMSMCPCloudAI1.Settings.OllamaModel := 'gemma4:26b-a4b-it-q4_K_M';
TMSMCPCloudAI1.Settings.OllamaPath := 'api/generate';

…I was able to get the OCR demo working.
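For context, as far as I can tell “api/generate” is Ollama’s native completion endpoint (as opposed to the OpenAI-compatible “/v1/chat/completions” route), and OllamaModel simply names whichever model has already been pulled locally.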

To your knowledge, should this also be possible through LM Studio? My end goal is supplying a PDF for analysis, but I have not had any success yet. The “/v1/chat/completions” endpoint is being hit, but no file I add is actually uploaded, even when setting the aiPDF type. As a fallback, I can use the RAG example supplied at Christmas, which uses pdfium, but I’d really like to just use the LLM.
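In case it helps anyone reproduce this, below is a minimal console sketch (Delphi RTL only, no TMS components) that posts a base64-encoded image straight to LM Studio’s OpenAI-compatible endpoint, to check whether the server accepts vision input independently of TTMSFNCCloudAI. The model name and the test image path are assumptions, so substitute whatever you have loaded; port 1234 is LM Studio’s default. As far as I can tell, llama.cpp-based servers accept images as data URIs on this endpoint but have no notion of a PDF attachment, which may be why nothing gets uploaded for aiPDF.

program TestLMStudioVision;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.IOUtils, System.JSON,
  System.NetEncoding, System.Net.URLClient, System.Net.HttpClient;

var
  B64: TBase64Encoding;
  Root, Msg, TextPart, ImgPart, ImgUrl: TJSONObject;
  Msgs, Content: TJSONArray;
  Http: THTTPClient;
  Req: TStringStream;
  Resp: IHTTPResponse;
  Image: string;
begin
  // Encode the test image without line breaks (required inside a data URI)
  B64 := TBase64Encoding.Create(0);
  try
    Image := B64.EncodeBytesToString(TFile.ReadAllBytes('test.png')); // assumed test file
  finally
    B64.Free;
  end;

  // Build a standard OpenAI-style vision request:
  // one user message whose content holds a text part and an image_url part
  TextPart := TJSONObject.Create;
  TextPart.AddPair('type', 'text');
  TextPart.AddPair('text', 'Extract all text from this image.');

  ImgUrl := TJSONObject.Create;
  ImgUrl.AddPair('url', 'data:image/png;base64,' + Image);
  ImgPart := TJSONObject.Create;
  ImgPart.AddPair('type', 'image_url');
  ImgPart.AddPair('image_url', ImgUrl);

  Content := TJSONArray.Create;
  Content.Add(TextPart);
  Content.Add(ImgPart);

  Msg := TJSONObject.Create;
  Msg.AddPair('role', 'user');
  Msg.AddPair('content', Content);

  Msgs := TJSONArray.Create;
  Msgs.Add(Msg);

  Root := TJSONObject.Create;
  Root.AddPair('model', 'gemma-3-27b-it'); // assumption: use whatever model LM Studio has loaded
  Root.AddPair('messages', Msgs);

  Http := THTTPClient.Create;
  Req := TStringStream.Create(Root.ToJSON, TEncoding.UTF8);
  try
    // 1234 is LM Studio's default server port
    Resp := Http.Post('http://localhost:1234/v1/chat/completions', Req, nil,
      [TNetHeader.Create('Content-Type', 'application/json')]);
    Writeln(Resp.ContentAsString);
  finally
    Req.Free;
    Http.Free;
    Root.Free; // owns all nested JSON values
  end;
end.

If this returns a sensible completion for an image, but the request schema offers no equivalent way to express a PDF, the limitation would sit with the inference engine rather than the component.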

PS. Can I suggest that the requirements of the demo be made a bit clearer? It was not obvious to me that the model had been hardwired, or that it was necessary to download it for the demo to work.