
Gemini Live API

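The Gemini Live API enables low-latency, bidirectional voice and video interactions with Gemini. With the Live API, you can give end users natural, human-like voice conversations, including the ability to interrupt the model's responses with voice commands. The Live API can process text, audio, and video input, and can produce text and audio output.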

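Features

The Live API has the following technical specifications:

Inputs: Text, audio, and video
Outputs: Text and audio (synthesized speech)
Default session length: 10 minutes
- Session length can be extended in 10-minute increments as needed
Context window: 32K tokens (32,768 by default)
Choice of 8 voices for responses
Support for responses in 31 languages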

Use the Live API

The following sections provide examples of how to use the Live API's features.

For more information, see the Gemini Live API reference guide.

Send text and receive audio

Gen AI SDK for Python

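import numpy as np
from IPython.display import Audio, Markdown, display
from google.genai.types import (
    Content,
    LiveConnectConfig,
    Part,
    PrebuiltVoiceConfig,
    SpeechConfig,
    VoiceConfig,
)

# Assumes `client` (a genai.Client) and MODEL_ID (the model name)
# are already defined in the notebook.
voice_name = "Aoede"  # @param ["Aoede", "Puck", "Charon", "Kore", "Fenrir", "Leda", "Orus", "Zephyr"]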

config = LiveConnectConfig(
    response_modalities=["AUDIO"],
    speech_config=SpeechConfig(
        voice_config=VoiceConfig(
            prebuilt_voice_config=PrebuiltVoiceConfig(
                voice_name=voice_name,
            )
        ),
    ),
)

async with client.aio.live.connect(
    model=MODEL_ID,
    config=config,
) as session:
    text_input = "Hello? Gemini are you there?"
    display(Markdown(f"**Input:** {text_input}"))

    await session.send_client_content(
        turns=Content(role="user", parts=[Part(text=text_input)]))

    audio_data = []
    async for message in session.receive():
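        # Skip messages that carry no model content (server_content can be None).
        if (
            message.server_content
            and message.server_content.model_turn
            and message.server_content.model_turn.parts
        ):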
            for part in message.server_content.model_turn.parts:
                if part.inline_data:
                    audio_data.append(
                        np.frombuffer(part.inline_data.data, dtype=np.int16)
                    )

    if audio_data:
        display(Audio(np.concatenate(audio_data), rate=24000, autoplay=True))

Send and receive text

Gen AI SDK for Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import (
    Content,
    LiveConnectConfig,
    HttpOptions,
    Modality,
    Part,
)

client = genai.Client(http_options=HttpOptions(api_version="v1beta1"))
model_id = "gemini-2.0-flash-live-preview-04-09"

async with client.aio.live.connect(
    model=model_id,
    config=LiveConnectConfig(response_modalities=[Modality.TEXT]),
) as session:
    text_input = "Hello? Gemini, are you there?"
    print("> ", text_input, "\n")
    await session.send_client_content(
        turns=Content(role="user", parts=[Part(text=text_input)])
    )

    response = []

    async for message in session.receive():
        if message.text:
            response.append(message.text)

    print("".join(response))
# Example output:
# >  Hello? Gemini, are you there?
# Yes, I'm here. What would you like to talk about?

Send audio

Gen AI SDK for Python

import asyncio
import wave
from google import genai
from google.genai.types import Content, Part

client = genai.Client(api_key="GEMINI_API_KEY", http_options={'api_version': 'v1alpha'})
model = "gemini-2.0-flash-live-preview-04-09"

config = {"response_modalities": ["AUDIO"]}

async def main():
    async with client.aio.live.connect(model=model, config=config) as session:
        wf = wave.open("audio.wav", "wb")
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(24000)

        message = "Hello? Gemini are you there?"
        await session.send_client_content(
            turns=Content(role="user", parts=[Part(text=message)]))

        # Write each audio chunk to the WAV file as it arrives.
        async for response in session.receive():
            if response.data is not None:
                wf.writeframes(response.data)

            # Un-comment this code to print audio data info
            # if response.server_content.model_turn is not None:
            #      print(response.server_content.model_turn.parts[0].inline_data.mime_type)

        wf.close()

if __name__ == "__main__":
    asyncio.run(main())

The Live API supports the following audio formats:

Input audio format: Raw 16-bit PCM audio at 16 kHz, little-endian
Output audio format: Raw 16-bit PCM audio at 24 kHz, little-endian
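
As a rough illustration of the input format above, the following sketch packs floating-point samples into raw 16-bit little-endian PCM at 16 kHz using only the standard library. The helper name `float_to_pcm16` and the test tone are illustrative assumptions and are not part of the Live API.

```python
import math
import struct

INPUT_SAMPLE_RATE = 16_000   # Hz, the input rate stated above
OUTPUT_SAMPLE_RATE = 24_000  # Hz, the output rate stated above


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to raw 16-bit little-endian PCM bytes."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    # "<h" means little-endian signed 16-bit integer.
    return struct.pack(f"<{len(ints)}h", *ints)


# Hypothetical test signal: half a second of a 440 Hz tone.
tone = [
    math.sin(2 * math.pi * 440 * n / INPUT_SAMPLE_RATE)
    for n in range(INPUT_SAMPLE_RATE // 2)
]
pcm_bytes = float_to_pcm16(tone)
print(len(pcm_bytes))  # two bytes per sample
```

Audio returned by the model uses the same sample encoding at 24 kHz, which is why the samples on this page decode it with `dtype=np.int16` and play it back with `rate=24000`.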

Audio transcription

The Live API can transcribe both input and output audio.

Gen AI SDK for Python

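import base64
import json

import numpy as np
from IPython.display import Audio, Markdown, display
from websockets.asyncio.client import connect

# Assumes SERVICE_URL (the Live API WebSocket endpoint) and bearer_token
# (a sequence whose first item is an access token) are already defined.
# Set model generation_config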
CONFIG = {
    'response_modalities': ['AUDIO'],
}

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {bearer_token[0]}",
}

# Connect to the server
async with connect(SERVICE_URL, additional_headers=headers) as ws:
    # Setup the session
    await ws.send(
        json.dumps(
            {
                "setup": {
                    "model": "gemini-2.0-flash-live-preview-04-09",
                    "generation_config": CONFIG,
                    'input_audio_transcription': {},
                    'output_audio_transcription': {}
                }
            }
        )
    )

    # Receive setup response
    raw_response = await ws.recv(decode=False)
    setup_response = json.loads(raw_response.decode("ascii"))

    # Send text message
    text_input = "Hello? Gemini are you there?"
    display(Markdown(f"**Input:** {text_input}"))

    msg = {
        "client_content": {
            "turns": [{"role": "user", "parts": [{"text": text_input}]}],
            "turn_complete": True,
        }
    }

    await ws.send(json.dumps(msg))

    responses = []
    input_transcriptions = []
    output_transcriptions = []

    # Receive chunks of server response
    async for raw_response in ws:
        response = json.loads(raw_response.decode())
        server_content = response.pop("serverContent", None)
        if server_content is None:
            break

        if (input_transcription := server_content.get("inputTranscription")) is not None:
            if (text := input_transcription.get("text")) is not None:
                input_transcriptions.append(text)
        if (output_transcription := server_content.get("outputTranscription")) is not None:
            if (text := output_transcription.get("text")) is not None:
                output_transcriptions.append(text)

        model_turn = server_content.pop("modelTurn", None)
        if model_turn is not None:
            parts = model_turn.pop("parts", None)
            if parts is not None:
                for part in parts:
                    pcm_data = base64.b64decode(part["inlineData"]["data"])
                    responses.append(np.frombuffer(pcm_data, dtype=np.int16))

        # End of turn
        turn_complete = server_content.pop("turnComplete", None)
        if turn_complete:
            break

    if input_transcriptions:
        display(Markdown(f"**Input transcription >** {''.join(input_transcriptions)}"))

    if responses:
        # Play the returned audio message
        display(Audio(np.concatenate(responses), rate=24000, autoplay=True))

    if output_transcriptions:
        display(Markdown(f"**Output transcription >** {''.join(output_transcriptions)}"))

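Change voice and language settings

The Live API uses Chirp 3 to support synthesized speech responses in 8 HD voices and 31 languages.

You can choose from the following voices:

Aoede (female)
Charon (male)
Fenrir (male)
Kore (female)
Leda (female)
Orus (male)
Puck (male)
Zephyr (female)

For demos of these voices and the full list of available languages, see Chirp 3: HD voices.

To set the response voice and language: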

Gen AI SDK for Python

config = LiveConnectConfig(
    response_modalities=["AUDIO"],
    speech_config=SpeechConfig(
        voice_config=VoiceConfig(
            prebuilt_voice_config=PrebuiltVoiceConfig(
                voice_name=voice_name,
            )
        ),
        language_code="en-US",
    ),
)

Console

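Open Vertex AI Studio > Live API.
In the Outputs expander, select a voice from the Voice drop-down.
In the same expander, select a language from the Language drop-down.
Click Start session to start the session.

When you prompt in a language other than English and want the model to respond in that language, include the following in your system instructions, replacing LANGUAGE with the target language: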

RESPOND IN LANGUAGE. YOU MUST RESPOND UNMISTAKABLY IN LANGUAGE.
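
A minimal sketch of filling in that template follows. The helper name `language_instruction` is hypothetical, and upper-casing the language name is only an assumption made to match the style of the template.

```python
def language_instruction(language: str) -> str:
    """Fill the LANGUAGE placeholder in the instruction template above."""
    template = "RESPOND IN {lang}. YOU MUST RESPOND UNMISTAKABLY IN {lang}."
    return template.format(lang=language.upper())


system_instruction = language_instruction("Japanese")
print(system_instruction)
```

The resulting string would then be supplied as part of the session's system instructions.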

Stream a conversation

Gen AI SDK for Python

Set up a conversation with the API that lets you send text prompts and receive audio responses:

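import base64
import json

import numpy as np
from IPython.display import Audio, Markdown, display
from websockets.asyncio.client import connect

# Assumes SERVICE_URL and bearer_token are already defined.
# Set model generation_config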
CONFIG = {"response_modalities": ["AUDIO"]}

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {bearer_token[0]}",
}

async def main() -> None:
    # Connect to the server
    async with connect(SERVICE_URL, additional_headers=headers) as ws:

        # Setup the session
        async def setup() -> None:
            await ws.send(
                json.dumps(
                    {
                        "setup": {
                            "model": "gemini-2.0-flash-live-preview-04-09",
                            "generation_config": CONFIG,
                        }
                    }
                )
            )

            # Receive setup response
            raw_response = await ws.recv(decode=False)
            setup_response = json.loads(raw_response.decode("ascii"))
            print(f"Connected: {setup_response}")
            return

        # Send text message
        async def send() -> bool:
            text_input = input("Input > ")
            if text_input.lower() in ("q", "quit", "exit"):
                return False

            msg = {
                "client_content": {
                    "turns": [{"role": "user", "parts": [{"text": text_input}]}],
                    "turn_complete": True,
                }
            }

            await ws.send(json.dumps(msg))
            return True

        # Receive server response
        async def receive() -> None:
            responses = []

            # Receive chunks of server response
            async for raw_response in ws:
                response = json.loads(raw_response.decode())
                server_content = response.pop("serverContent", None)
                if server_content is None:
                    break

                model_turn = server_content.pop("modelTurn", None)
                if model_turn is not None:
                    parts = model_turn.pop("parts", None)
                    if parts is not None:
                        for part in parts:
                            pcm_data = base64.b64decode(part["inlineData"]["data"])
                            responses.append(np.frombuffer(pcm_data, dtype=np.int16))

                # End of turn
                turn_complete = server_content.pop("turnComplete", None)
                if turn_complete:
                    break

            # Play the returned audio message, if any audio was received
            if responses:
                display(Markdown("**Response >**"))
                display(Audio(np.concatenate(responses), rate=24000, autoplay=True))
            return

        await setup()

        while True:
            if not await send():
                break
            await receive()

Start the conversation and enter your prompts, or enter q, quit, or exit to stop.

await main()

Console

Open Vertex AI Studio > Live API.
Click Start session to start the conversation session.

To end the session, click Stop session.

Session length

The default maximum length of a conversation session is 10 minutes. A go_away notification (BidiGenerateContentServerMessage.go_away) is sent back to the client 60 seconds before the session ends.
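
The sketch below shows one way a client reading raw WebSocket messages might detect that notification. It assumes the notice arrives under the camelCase JSON key `goAway` (by analogy with the `serverContent` key used in the samples on this page) and that it carries a `timeLeft` duration string such as "60s"; both names are assumptions to verify against the reference guide.

```python
import json
import re


def seconds_until_disconnect(raw_message: str):
    """Return the seconds left if the message is a go_away notice, else None."""
    message = json.loads(raw_message)
    go_away = message.get("goAway")  # assumed JSON key
    if go_away is None:
        return None
    # Assumption: timeLeft is a duration rendered as text, for example "60s".
    match = re.fullmatch(r"(\d+(?:\.\d+)?)s", str(go_away.get("timeLeft", "")))
    # Treat a missing or unparseable value as "disconnect imminent".
    return float(match.group(1)) if match else 0.0


print(seconds_until_disconnect('{"goAway": {"timeLeft": "60s"}}'))
print(seconds_until_disconnect('{"serverContent": {"turnComplete": true}}'))
```

A client could use the returned value to finish the current turn and reconnect before the session closes.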

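When using the API, you can extend the session length in 10-minute increments. There is no limit on the number of times you can extend a session. For an example of how to extend the session length, see Enable and disable session resumption. This feature is currently available only in the API, not in Vertex AI Studio.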

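Context window

The maximum context length for a session in the Live API is 32,768 tokens by default. These tokens are allocated to store real-time data that is streamed in at a rate of 25 tokens per second (TPS) for audio and 258 TPS for video, as well as other content such as text-based inputs and model outputs.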
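
To make those rates concrete, the following sketch estimates how long a stream can run before it fills the default window. It deliberately ignores text inputs and model outputs, which also consume tokens, so real sessions fill the window sooner.

```python
CONTEXT_TOKENS = 32_768  # default maximum context length
AUDIO_TPS = 25           # tokens per second of audio
VIDEO_TPS = 258          # tokens per second of video


def seconds_of_stream(tokens: int, tokens_per_second: int) -> float:
    """Seconds of streaming that fit into the given token budget."""
    return tokens / tokens_per_second


audio_only = seconds_of_stream(CONTEXT_TOKENS, AUDIO_TPS)
audio_video = seconds_of_stream(CONTEXT_TOKENS, AUDIO_TPS + VIDEO_TPS)

print(f"audio only:    {audio_only / 60:.1f} min")   # 21.8 min
print(f"audio + video: {audio_video / 60:.1f} min")  # 1.9 min
```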

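If the context window exceeds the maximum context length, the contexts of the oldest turns in the context window are truncated so that the overall context window size stays below the limit.

The default context length of the session, and the target context length after truncation, can be configured with the context_window_compression.trigger_tokens and context_window_compression.sliding_window.target_tokens fields of the setup message, respectively.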
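
A minimal sketch of a setup message that sets both fields follows. The field paths come from the paragraph above; the helper name `build_setup`, the token values, and the check that the target is smaller than the trigger are illustrative assumptions.

```python
import json


def build_setup(model: str, trigger_tokens: int, target_tokens: int) -> dict:
    """Build a setup message with context window compression configured."""
    # Assumption: truncating to a target at or above the trigger makes no sense.
    if target_tokens >= trigger_tokens:
        raise ValueError("target_tokens should be smaller than trigger_tokens")
    return {
        "setup": {
            "model": model,
            "context_window_compression": {
                "trigger_tokens": trigger_tokens,
                "sliding_window": {"target_tokens": target_tokens},
            },
        }
    }


# Hypothetical values: start truncating at 16,000 tokens, shrink to 8,000.
setup_message = build_setup("gemini-2.0-flash-live-preview-04-09", 16_000, 8_000)
print(json.dumps(setup_message, indent=2))
```

The serialized message would be sent as the first message on the connection, in the same way as the setup messages in the WebSocket samples on this page.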

Concurrent sessions

By default, you can have up to 10 concurrent sessions per project.
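
One way a client application could stay within that default is to guard session creation with a semaphore. The sketch below is a generic asyncio pattern, not a Live API feature; `fake_session` is a hypothetical stand-in for code that opens and uses a real session.

```python
import asyncio

MAX_CONCURRENT_SESSIONS = 10  # default per-project limit stated above


async def run_sessions(jobs, limit=MAX_CONCURRENT_SESSIONS):
    """Run coroutine factories with at most `limit` active at once."""
    semaphore = asyncio.Semaphore(limit)
    active = 0
    peak = 0

    async def guarded(job):
        nonlocal active, peak
        async with semaphore:
            active += 1
            peak = max(peak, active)
            try:
                return await job()
            finally:
                active -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def fake_session():
    # Hypothetical placeholder for connecting and exchanging messages.
    await asyncio.sleep(0.01)
    return "done"


results, peak = asyncio.run(run_sessions([fake_session] * 25))
print(len(results), peak)
```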

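Update system instructions mid-session

The Live API lets you update the system instructions in the middle of an active session. You can use this to adapt the model's responses during the session, for example to change the language the model responds in or to modify its tone.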
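
The sketch below builds such an update as a client_content message whose turn has the role "system". That message shape is an assumption modeled on the client_content messages used elsewhere on this page, so confirm it against the reference guide; the helper name is hypothetical.

```python
import json


def system_instruction_update(text: str) -> str:
    """Build a message that replaces the system instructions mid-session."""
    message = {
        "client_content": {
            # Assumption: a turn with role "system" carries the new instructions.
            "turns": [{"role": "system", "parts": [{"text": text}]}],
            # The update is not a user turn, so it does not end the turn.
            "turn_complete": False,
        }
    }
    return json.dumps(message)


update = system_instruction_update("Respond in a formal tone from now on.")
print(update)
```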

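Change voice activity detection settings

By default, the model automatically performs voice activity detection (VAD) on a continuous audio input stream. VAD can be configured with the realtimeInputConfig.automaticActivityDetection field of the setup message.

When the audio stream is paused for more than a second (for example, because the user switched off the microphone), send an audioStreamEnd event to flush any cached audio. The client can resume sending audio data at any time.

Alternatively, you can disable automatic VAD by setting realtimeInputConfig.automaticActivityDetection.disabled to true in the setup message. In this configuration, the client is responsible for detecting user speech and sending activityStart and activityEnd messages at the appropriate times. An audioStreamEnd isn't sent in this configuration. Instead, any interruption of the stream is marked by an activityEnd message.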
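
The two modes described above can be sketched as follows. The field names realtimeInputConfig, automaticActivityDetection, activityStart, activityEnd, and audioStreamEnd come from the text; wrapping the events in a `realtimeInput` envelope, the payload values, and the helper names are assumptions to check against the reference guide.

```python
import json


def setup_with_manual_vad(model: str) -> str:
    """Build a setup message that turns automatic VAD off."""
    return json.dumps(
        {
            "setup": {
                "model": model,
                "realtimeInputConfig": {
                    "automaticActivityDetection": {"disabled": True}
                },
            }
        }
    )


def realtime_event(name: str) -> str:
    """Build an activityStart, activityEnd, or audioStreamEnd message."""
    if name in ("activityStart", "activityEnd"):
        payload = {name: {}}    # used only when automatic VAD is disabled
    elif name == "audioStreamEnd":
        payload = {name: True}  # used only when automatic VAD is enabled
    else:
        raise ValueError(f"unknown event: {name}")
    # Assumption: these events travel inside a "realtimeInput" envelope.
    return json.dumps({"realtimeInput": payload})


def needs_stream_end(last_audio_at: float, now: float, pause: float = 1.0) -> bool:
    """With automatic VAD on, report whether the pause is long enough to flush."""
    return now - last_audio_at >= pause


print(setup_with_manual_vad("gemini-2.0-flash-live-preview-04-09"))
print(realtime_event("activityStart"))
print(needs_stream_end(last_audio_at=10.0, now=11.2))
```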

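Enable and disable session resumption

This feature is disabled by default. It must be enabled by the user every time they call the API by specifying the field in the API request, and project-level privacy is enforced for cached data. Enabling session resumption lets the user reconnect to a previous session within 24 hours, because cached data, including text, video, and audio prompt data and model outputs, is stored for up to 24 hours. To achieve zero data retention, don't enable this feature.

To enable session resumption, set the session_resumption field of the BidiGenerateContentSetup message. If enabled, the server periodically takes a snapshot of the current cached session context and stores it in internal storage. When a snapshot is successfully taken, a resumption_update is returned with a handle ID that you can record and use later to resume the session from the snapshot.

Here's an example of enabling session resumption and collecting the handle ID: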

Gen AI SDK for Python

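import json

from IPython.display import Markdown, display
from websockets.asyncio.client import connect

# Assumes SERVICE_URL and bearer_token are already defined.
# Set model generation_config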
CONFIG = {"response_modalities": ["TEXT"]}

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {bearer_token[0]}",
}

# Connect to the server
async with connect(SERVICE_URL, additional_headers=headers) as ws:
    # Setup the session
    await ws.send(
        json.dumps(
            {
                "setup": {
                    "model": "gemini-2.0-flash-live-preview-04-09",
                    "generation_config": CONFIG,
                    # Enable session resumption.
                    "session_resumption": {},
                }
            }
        )
    )

    # Receive setup response
    raw_response = await ws.recv(decode=False)
    setup_response = json.loads(raw_response.decode("ascii"))

    # Send text message
    text_input = "Hello? Gemini are you there?"
    display(Markdown(f"**Input:** {text_input}"))

    msg = {
        "client_content": {
            "turns": [{"role": "user", "parts": [{"text": text_input}]}],
            "turn_complete": True,
        }
    }

    await ws.send(json.dumps(msg))

    responses = []
    handle_id = ""

    turn_completed = False
    resumption_received = False

    # Receive chunks of server response,
    # wait for turn completion and resumption handle.
    async for raw_response in ws:
        response = json.loads(raw_response.decode())

        server_content = response.pop("serverContent", None)
        resumption_update = response.pop("sessionResumptionUpdate", None)

        if server_content is not None:
          model_turn = server_content.pop("modelTurn", None)
          if model_turn is not None:
              parts = model_turn.pop("parts", None)
              if parts is not None:
                  responses.append(parts[0]["text"])

          # End of turn
          turn_complete = server_content.pop("turnComplete", None)
          if turn_complete:
            turn_completed = True

        elif resumption_update is not None:
          handle_id = resumption_update['newHandle']
          resumption_received = True
        else:
          continue

        # Use the persistent flag: turn_complete is unset if no serverContent arrived yet.
        if turn_completed and resumption_received:
          break

    # Print the server response
    display(Markdown(f"**Response >** {''.join(responses)}"))
    display(Markdown(f"**Session Handle ID >** {handle_id}"))

To resume the previous session, set the handle field of the setup.session_resumption configuration to the previously recorded handle ID:

Gen AI SDK for Python

# Set model generation_config
CONFIG = {"response_modalities": ["TEXT"]}

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {bearer_token[0]}",
}

# Connect to the server
async with connect(SERVICE_URL, additional_headers=headers) as ws:
    # Setup the session
    await ws.send(
        json.dumps(
            {
                "setup": {
                    "model": "gemini-2.0-flash-live-preview-04-09",
                    "generation_config": CONFIG,
                    # Enable session resumption.
                    "session_resumption": {
                        "handle": handle_id,
                    },
                }
            }
        )
    )

    # Receive setup response
    raw_response = await ws.recv(decode=False)
    setup_response = json.loads(raw_response.decode("ascii"))

    # Send text message
    text_input = "What was the last question I asked?"
    display(Markdown(f"**Input:** {text_input}"))

    msg = {
        "client_content": {
            "turns": [{"role": "user", "parts": [{"text": text_input}]}],
            "turn_complete": True,
        }
    }

    await ws.send(json.dumps(msg))

    responses = []
    handle_id = ""

    turn_completed = False
    resumption_received = False

    # Receive chunks of server response,
    # wait for turn completion and resumption handle.
    async for raw_response in ws:
        response = json.loads(raw_response.decode())

        server_content = response.pop("serverContent", None)
        resumption_update = response.pop("sessionResumptionUpdate", None)

        if server_content is not None:
          model_turn = server_content.pop("modelTurn", None)
          if model_turn is not None:
              parts = model_turn.pop("parts", None)
              if parts is not None:
                  responses.append(parts[0]["text"])

          # End of turn
          turn_complete = server_content.pop("turnComplete", None)
          if turn_complete:
            turn_completed = True

        elif resumption_update is not None:
          handle_id = resumption_update['newHandle']
          resumption_received = True
        else:
          continue

        # Use the persistent flag: turn_complete is unset if no serverContent arrived yet.
        if turn_completed and resumption_received:
          break

    # Print the server response
    # Expected answer: "You just asked if I was there."
    display(Markdown(f"**Response >** {''.join(responses)}"))
    display(Markdown(f"**Session Handle ID >** {handle_id}"))

To resume sessions seamlessly, enable transparent mode:

Gen AI SDK for Python

await ws.send(
    json.dumps(
        {
            "setup": {
                "model": "gemini-2.0-flash-live-preview-04-09",
                "generation_config": CONFIG,
                # Enable session resumption in transparent mode.
                "session_resumption": {
                    "transparent": True,
                },
            }
        }
    )
)

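After transparent mode is enabled, the index of the client message that corresponds to the context snapshot is explicitly returned. This helps identify which client messages you need to send again when you resume the session from the resumption handle.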
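
The bookkeeping this implies on the client side can be sketched as follows. The sketch assumes the client keeps its sent messages in order and that the returned index is zero-based and refers to the last message already included in the snapshot; the names `messages_to_resend` and `last_consumed_index` are hypothetical.

```python
def messages_to_resend(sent_messages, last_consumed_index):
    """Return the client messages sent after the snapshot was taken."""
    # Everything up to and including the index is already in the snapshot.
    return list(sent_messages[last_consumed_index + 1:])


# Hypothetical log of messages this client has sent, in order.
sent = ["turn 0", "turn 1", "turn 2", "turn 3"]

# Suppose the snapshot covers messages up to index 1.
print(messages_to_resend(sent, last_consumed_index=1))  # ['turn 2', 'turn 3']
```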

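Use function calling

You can use function calling to create a description of a function, then pass that description to the model in a request. The response from the model includes the name of a function that matches the description and the arguments to call it with.

All functions must be declared at the start of the session by sending tool definitions as part of the setup message.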

Gen AI SDK for Python

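import json

from IPython.display import Markdown, display
from websockets.asyncio.client import connect

# Assumes SERVICE_URL and bearer_token are already defined.
# Set model generation_config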
CONFIG = {"response_modalities": ["TEXT"]}

# Define function declarations
TOOLS = {
    "function_declarations": {
        "name": "get_current_weather",
        "description": "Get the current weather in the given location",
        "parameters": {
            "type": "OBJECT",
            "properties": {"location": {"type": "STRING"}},
        },
    }
}

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {bearer_token[0]}",
}

# Connect to the server
async with connect(SERVICE_URL, additional_headers=headers) as ws:
    # Setup the session
    await ws.send(
        json.dumps(
            {
                "setup": {
                    "model": "gemini-2.0-flash-live-preview-04-09",
                    "generation_config": CONFIG,
                    "tools": TOOLS,
                }
            }
        )
    )

    # Receive setup response
    raw_response = await ws.recv(decode=False)
    setup_response = json.loads(raw_response.decode())

    # Send text message
    text_input = "Get the current weather in Santa Clara, San Jose and Mountain View"
    display(Markdown(f"**Input:** {text_input}"))

    msg = {
        "client_content": {
            "turns": [{"role": "user", "parts": [{"text": text_input}]}],
            "turn_complete": True,
        }
    }

    await ws.send(json.dumps(msg))

    responses = []

    # Receive chunks of server response
    async for raw_response in ws:
        response = json.loads(raw_response.decode("UTF-8"))

        if (tool_call := response.get("toolCall")) is not None:
            for function_call in tool_call["functionCalls"]:
                responses.append(f"FunctionCall: {str(function_call)}\n")

        if (server_content := response.get("serverContent")) is not None:
            if server_content.get("turnComplete", True):
                break

    # Print the server response
    display(Markdown("**Response >** {}".format("\n".join(responses))))
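
The sample above only prints the function calls the model requests. A possible next step, sketched here, is to run each call locally and build a reply for the model. The `tool_response` / `function_responses` message shape and the `id` and `args` keys on each call are assumptions to confirm in the reference guide, and `get_current_weather` is a hypothetical local implementation that returns canned data.

```python
import json


def get_current_weather(location: str) -> dict:
    # Hypothetical stand-in for a real weather lookup.
    return {"location": location, "forecast": "sunny"}


def build_tool_response(tool_call: dict) -> str:
    """Run each requested function and package the results for the model."""
    function_responses = []
    for call in tool_call["functionCalls"]:
        result = get_current_weather(**call.get("args", {}))
        entry = {"name": call["name"], "response": result}
        if "id" in call:
            # Echo the call ID so the reply can be matched to the request.
            entry["id"] = call["id"]
        function_responses.append(entry)
    return json.dumps({"tool_response": {"function_responses": function_responses}})


# Hypothetical toolCall payload, shaped like the one read in the sample above.
example_call = {
    "functionCalls": [
        {"id": "call-1", "name": "get_current_weather", "args": {"location": "San Jose"}}
    ]
}
print(build_tool_response(example_call))
```

The resulting string would be sent over the same WebSocket connection, after which the model can continue its turn using the returned data.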

Use code execution

You can use code execution with the Live API to generate and execute Python code directly.

Gen AI SDK for Python

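import json

from IPython.display import Markdown, display
from websockets.asyncio.client import connect

# Assumes SERVICE_URL and bearer_token are already defined.
# Set model generation_config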
CONFIG = {"response_modalities": ["TEXT"]}

# Set code execution
TOOLS = {"code_execution": {}}

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {bearer_token[0]}",
}

# Connect to the server
async with connect(SERVICE_URL, additional_headers=headers) as ws:
    # Setup the session
    await ws.send(
        json.dumps(
            {
                "setup": {
                    "model": "gemini-2.0-flash-live-preview-04-09",
                    "generation_config": CONFIG,
                    "tools": TOOLS,
                }
            }
        )
    )

    # Receive setup response
    raw_response = await ws.recv(decode=False)
    setup_response = json.loads(raw_response.decode())

    # Send text message
    text_input = "Write code to calculate the 15th fibonacci number then find the nearest palindrome to it"
    display(Markdown(f"**Input:** {text_input}"))

    msg = {
        "client_content": {
            "turns": [{"role": "user", "parts": [{"text": text_input}]}],
            "turn_complete": True,
        }
    }

    await ws.send(json.dumps(msg))

    responses = []

    # Receive chunks of server response
    async for raw_response in ws:
        response = json.loads(raw_response.decode("UTF-8"))

        if (server_content := response.get("serverContent")) is not None:
            if (model_turn:= server_content.get("modelTurn")) is not None:
              if (parts := model_turn.get("parts")) is not None:
                if parts[0].get("text"):
                    responses.append(parts[0]["text"])
                for part in parts:
                    if (executable_code := part.get("executableCode")) is not None:
                        display(
                            Markdown(
                                f"""**Executable code:**
```py
{executable_code.get("code")}
```
                            """
                            )
                        )
            if server_content.get("turnComplete", False):
                break

    # Print the server response
    display(Markdown(f"**Response >** {''.join(responses)}"))

Use Grounding with Google Search

You can use Grounding with Google Search with the Live API by using google_search:

Gen AI SDK for Python

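import json

from IPython.display import Markdown, display
from websockets.asyncio.client import connect

# Assumes SERVICE_URL and bearer_token are already defined.
# Set model generation_config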
CONFIG = {"response_modalities": ["TEXT"]}

# Set google search
TOOLS = {"google_search": {}}

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {bearer_token[0]}",
}

# Connect to the server
async with connect(SERVICE_URL, additional_headers=headers) as ws:
    # Setup the session
    await ws.send(
        json.dumps(
            {
                "setup": {
                    "model": "gemini-2.0-flash-live-preview-04-09",
                    "generation_config": CONFIG,
                    "tools": TOOLS,
                }
            }
        )
    )

    # Receive setup response
    raw_response = await ws.recv(decode=False)
    setup_response = json.loads(raw_response.decode())

    # Send text message
    text_input = "What is the current weather in San Jose, CA?"
    display(Markdown(f"**Input:** {text_input}"))

    msg = {
        "client_content": {
            "turns": [{"role": "user", "parts": [{"text": text_input}]}],
            "turn_complete": True,
        }
    }

    await ws.send(json.dumps(msg))

    responses = []

    # Receive chunks of server response
    async for raw_response in ws:
        response = json.loads(raw_response.decode())
        server_content = response.pop("serverContent", None)
        if server_content is None:
            break

        model_turn = server_content.pop("modelTurn", None)
        if model_turn is not None:
            parts = model_turn.pop("parts", None)
            if parts is not None:
                responses.append(parts[0]["text"])

        # End of turn
        turn_complete = server_content.pop("turnComplete", None)
        if turn_complete:
            break

    # Print the server response
    display(Markdown("**Response >** {}".format("\n".join(responses))))

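Limitations

For the full list of current limitations for the Live API, see the Gemini Live API limitations section of the reference documentation.

Pricing

For details, see the Pricing page.

More information

For more information on the Live API, such as the WebSocket API reference, see the Gemini API documentation.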
