Issue with Streaming SSE Responses in TTMSFNCCustomCloudBase on Windows

giltahan · April 11, 2025, 7:34am

Hello,

I'm encountering an issue when using TTMSFNCCustomCloudBase on Windows to call a server-sent events (SSE) endpoint. The endpoint should return data as a real-time stream, but instead the response is fully buffered and only delivered once the entire stream is complete.

Important note: When I run the exact same request using curl --no-buffer from the Windows command line, the response is streamed correctly in real time — so the issue is not with the Gemini API or Windows itself, but appears specific to how the TMS component handles the response.

This problem happens with both .Curl(...) and .ExecuteRequest(...).

Here’s a minimal FireMonkey example using .Curl(...) that shows the issue:

unit Unit1;

interface

uses
  System.SysUtils, System.Types, System.UITypes, System.Classes, System.Variants,
  FMX.Types, FMX.Controls, FMX.Forms, FMX.Graphics, FMX.Dialogs, FMX.Memo.Types,
  FMX.StdCtrls, FMX.Controls.Presentation, FMX.ScrollBox, FMX.Memo, FMX.TMSFNCCloudBase;

type
  TForm1 = class(TForm)
    Memo1: TMemo;
    Button1: TButton;
    procedure Button1Click(Sender: TObject);
  private
    procedure RequestResultStringEvent(const AResult: string);
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.fmx}

procedure TForm1.RequestResultStringEvent(const AResult: string);
begin
  Memo1.Lines.Add(AResult);
end;

procedure TForm1.Button1Click(Sender: TObject);
const
  // Placeholder: substitute your own key from https://aistudio.google.com/apikey
  YOUR_API_KEY = '<your-api-key>';
var
  CurlCmd: string;
begin
  CurlCmd := 'CURL "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse&key='+YOUR_API_KEY+'" ' +
             '-H "Content-Type: application/json" ' +
             '--no-buffer ' +
             '-d ''{"contents": [{"parts": [{"text": "Explain how AI works"}]}]}''';

  Memo1.Lines.Add('Starting: '+CurlCmd);

  TTMSFNCCustomCloudBase.Curl(CurlCmd, RequestResultStringEvent);
end;

end.

Even when I build and execute a structured request using ExecuteRequest(...), I get the same buffering behavior:

procedure TGeminiClient.BuildRequest(id: string = '');
const
  cHeaderContent = 'Content-Type';
  cHeaderContentValue = 'application/json';
  cHostGemini = 'https://generativelanguage.googleapis.com';
  cPathGemini = 'v1beta/models/';
begin
  Request.Headers.Clear;
  Request.AddHeader(cHeaderContent, cHeaderContentValue);
  Request.Method := rmPOST;
  Request.Name := id;

  Request.Host := cHostGemini;
  Request.Path := cPathGemini + FSettings.Model + ':streamGenerateContent?alt=sse&key=' + FSettings.APIKey;
  Request.PostData := BuildPostData();
end;

function TGeminiClient.Execute(id: string = ''): boolean;
begin
  Inc(FBusyCount);
  BuildRequest(id);
  ExecuteRequest(DoGeminiRequest, DoStreamProgress, True);
  Result := True;
end;

Project1.dpr (215 Bytes)
Project1.dproj (73.7 KB)
Unit1.pas (1.3 KB)

Is there any setting or workaround in the TMS FNC Cloud framework to allow processing the stream in real time without buffering?

Also, perhaps I’ve missed some documentation or manual that explains how TTMSFNCCustomCloudBase handles streamed responses? If there's something I should review, I’d appreciate a pointer.

P.S. Unrelated, but is it possible to use the TMS FNC REST client to upload a binary file (e.g., as multipart/form-data or raw bytes)? Or is that not its intended use?

Thanks in advance for your help!

Pieter · April 11, 2025, 8:02am

Hi,

You have 3 kinds of result types:

rrtString: The default return value of the request result. Can be XML, JSON, or any other type of text. The property ARequestResult.ResultString contains the value.
rrtStream: Returns the content as a stream. The property ARequestResult.ResultStream contains the content.
rrtFile: Immediately saves the content to a file, specified by c.Request.ResultFile before executing the request.

You can set the required result type with Request.ResultType.

Keep in mind that the first two hold the response in memory, while rrtFile streams directly to a file.

giltahan · April 11, 2025, 10:50am

Thanks, Pieter — I appreciate the clarification regarding the different ResultType options (rrtString, rrtStream, rrtFile).

However, I think this may not fully address the core problem I'm reporting.

In my case, I'm calling the Gemini API using their streamGenerateContent?alt=sse endpoint, which returns server-sent events (SSE) — a format specifically designed to deliver very small chunks of data in real time. The whole point of this endpoint is to begin processing the stream immediately (token by token, message by message) as the server pushes updates.

When using TMS FNC Cloud components (TTMSFNCCustomCloudBase), whether I choose rrtString, rrtStream, or even .Curl(...), it appears everything is fully buffered internally — and I only receive data after the stream is completely finished. That defeats the purpose of using SSE and makes the real-time behavior impossible.

I verified that on the same Windows machine, if I run the identical request via curl --no-buffer, I receive the stream correctly in real time, as expected. So it’s not a limitation of the Gemini API or the platform — it's something in how the TMS component handles the response.
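To illustrate what "processing the stream in real time" would mean on the client side, here is a minimal sketch of an incremental SSE parser. It assumes the HTTP layer hands over the response in arbitrary text chunks, which is exactly the piece that is missing here. TSseLineBuffer and Feed are hypothetical names, not part of TMS FNC, and the sketch simplifies the SSE format by treating every "data:" line as its own event instead of joining multi-line events.

```pascal
program SseParseSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

type
  // Hypothetical helper (not part of TMS FNC): accumulates text chunks and
  // extracts the payload of every completed "data:" line.
  TSseLineBuffer = class
  private
    FPending: string; // text received so far that does not yet end in a line feed
  public
    procedure Feed(const AChunk: string; AEvents: TStrings);
  end;

procedure TSseLineBuffer.Feed(const AChunk: string; AEvents: TStrings);
var
  P: Integer;
  Line, Payload: string;
begin
  FPending := FPending + AChunk;
  P := Pos(#10, FPending);
  while P > 0 do
  begin
    Line := Copy(FPending, 1, P - 1);
    Delete(FPending, 1, P);
    // Tolerate CRLF line endings.
    Line := Line.TrimRight([#13]);
    if Line.StartsWith('data:') then
    begin
      Payload := Line.Substring(5);
      // The SSE format allows one optional space after the colon.
      if Payload.StartsWith(' ') then
        Payload := Payload.Substring(1);
      AEvents.Add(Payload);
    end;
    P := Pos(#10, FPending);
  end;
end;

var
  Buf: TSseLineBuffer;
  Events: TStringList;
  S: string;
begin
  Buf := TSseLineBuffer.Create;
  Events := TStringList.Create;
  try
    // Two chunks that split one event in the middle, as a network read might.
    Buf.Feed('data: {"text":"Hel', Events);
    Buf.Feed('lo"}'#10#10'data: {"text":" world"}'#10#10, Events);
    for S in Events do
      Writeln(S);
  finally
    Events.Free;
    Buf.Free;
  end;
end.
```

With the two sample chunks above, the first call yields nothing because the line is still incomplete, and the second call yields both JSON payloads. In a real client each payload would be handled as soon as it is extracted, which only helps if the component delivers chunks before the response has finished.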

Could you please clarify:

1. Does TTMSFNCCustomCloudBase currently support true streaming (i.e., chunked HTTP response handling)?
2. If not, can this be addressed as a bug or missing feature, since it's a key use case for modern APIs (including AI and chat services)?
3. Is there any low-level event like OnReceiveData that I can hook into during the response to process partial chunks?
4. And lastly, as asked before: is there any documentation/manual that explains this aspect of TTMSFNCCustomCloudBase in more detail?

Also, unrelated but important: Can the TMS FNC REST client be used to upload binary files (e.g., multipart/form-data or raw bytes)? If so, any example or pointer would be appreciated.

Thanks again — I really hope this can be clarified or improved, as live streaming support is essential for some of the modern use cases I’m working on.

Best regards,
Gil.

Pieter · April 11, 2025, 11:23am

There is unfortunately no event that is triggered to handle chunks. We'll add this to our feature request list.

There is no specific detailed documentation on TTMSFNCCloudBase.

It should be possible to upload multipart form data. Here is an example:

PostDataBuilder.Clear;
PostDataBuilder.AddFormData('userfile', '', True,
  ExtractFileName(ARequestResult.DataString), 'application/octet-stream');
s := PostDataBuilder.Build;

Request.Clear;
Request.Name := 'UPLOAD FILE';
Request.Host := Service.BaseURL;
Request.Path := '/upload' + FBasePath + id + '?access_token=' +
  Authentication.AccessToken;
Request.Method := rmPUTMULTIPART;
Request.PostData := s;
Request.UploadFile := ARequestResult.DataUpload;
ExecuteRequest(DoRequestUploadFile);
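
For readers unfamiliar with multipart/form-data, the following sketch shows roughly what such a request body looks like on the wire for a single file part. It is a generic illustration of the format, not a description of what PostDataBuilder produces internally. The function names, the boundary value, and the file name are all made up for this example.

```pascal
program MultipartSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: the headers that precede the file bytes of one part.
function BuildPartHead(const ABoundary, AFieldName, AFileName,
  AContentType: string): string;
begin
  Result :=
    '--' + ABoundary + #13#10 +
    'Content-Disposition: form-data; name="' + AFieldName +
    '"; filename="' + AFileName + '"' + #13#10 +
    'Content-Type: ' + AContentType + #13#10 +
    #13#10; // an empty line separates the part headers from the part body
end;

// Hypothetical helper: the closing delimiter that ends the whole body.
function BuildBodyTail(const ABoundary: string): string;
begin
  Result := #13#10 + '--' + ABoundary + '--' + #13#10;
end;

const
  // Assumed value; any token that does not occur in the file data works.
  cBoundary = 'example-boundary-12345';
begin
  Write(BuildPartHead(cBoundary, 'userfile', 'report.bin',
    'application/octet-stream'));
  Write('<raw file bytes go here>');
  Write(BuildBodyTail(cBoundary));
end.
```

The same boundary also has to appear in the request's Content-Type header (multipart/form-data; boundary=...), so the server knows where each part starts and ends.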
