Syllable Stress Assessment Agent

Read a paragraph aloud to evaluate how you stress nouns and verbs.

Developers: A2A usage

Agent Card URL: /.well-known/agent.json
A2A JSON-RPC endpoint: /a2a (methods: agent.about, paragraphs.count, paragraphs.get_text, pronunciation.evaluate)
Health endpoint: /healthz

Example request: paragraphs.count

Use this JSON-RPC request to discover how many practice paragraphs are configured. It takes no parameters, so send an empty params object. It is useful for dynamically building a paragraph selector or validating configuration in a client integration. The returned count determines the valid paragraph ID range for subsequent calls.

{
  "jsonrpc": "2.0",
  "id": "p-count",
  "method": "paragraphs.count",
  "params": {}
}
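
As a rough sketch, a Python client could build this request and read the count as follows. The helper names build_request and extract_count are illustrative only. The result shape is an assumption: this page does not show a paragraphs.count response, so the code accepts either a bare integer or an object with a count field.

```python
from typing import Any, Optional


def build_request(method: str, params: Optional[dict] = None, request_id: str = "req-1") -> dict:
    """Build a JSON-RPC 2.0 request envelope like the one shown above."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": params if params is not None else {},
    }


def extract_count(response: dict) -> int:
    """Read the paragraph count from a paragraphs.count response."""
    # JSON-RPC 2.0 responses carry either "result" or "error".
    if "error" in response:
        raise RuntimeError(f"JSON-RPC error: {response['error']}")
    result: Any = response["result"]
    # ASSUMPTION: the result is either a bare integer or {"count": N}.
    # Check this against a real response before relying on it.
    count = result.get("count") if isinstance(result, dict) else result
    if isinstance(count, bool) or not isinstance(count, int) or count < 0:
        raise ValueError(f"unexpected count payload: {result!r}")
    return count
```

Whether paragraph IDs start at 0 or 1 is not stated here, so confirm the numbering with a paragraphs.get_text call before building a selector from the count.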

Example request: paragraphs.get_text

Use this request to fetch a specific paragraph by ID before recording or analysis. The required parameter is paragraph_id, which must be a valid numeric paragraph identifier. Call this when you need the canonical prompt text in a client app, and interpret the response as the exact reference text expected by the evaluator for that paragraph.

{
  "jsonrpc": "2.0",
  "id": "p-text-3",
  "method": "paragraphs.get_text",
  "params": {
    "paragraph_id": 3
  }
}

Example request: pronunciation.evaluate

This is the main scoring request for stress-shift pronunciation analysis. Provide audio_wav_base64 (required), containing base64-encoded 16 kHz mono WAV audio, plus either paragraph_id or the paragraph text as context. The native_exemplar flag is optional and affects how output is persisted: when true, persisted bucket filenames gain an e suffix before .wav/.json. Optionally include deepgram_api_key to override server-side key selection for that request. Send this request after capturing audio, then read the returned analysis fields (alignment, target statuses, and summary score) to determine pronunciation accuracy and feedback.

{
  "jsonrpc": "2.0",
  "id": "p3-a2a-demo",
  "method": "pronunciation.evaluate",
  "params": {
    "paragraph_id": 3,
    "audio_wav_base64": "BASE64_16KHZ_MONO_WAV",
    "native_exemplar": false,
    "deepgram_api_key": "dg_live_xxx"
  }
}
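
Before sending, it can help to confirm locally that the audio really is 16 kHz mono WAV. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the server may apply further checks (such as sample width or duration) that this page does not describe.

```python
import base64
import io
import wave


def encode_wav_for_evaluate(wav_bytes: bytes) -> str:
    """Check that the audio is 16 kHz mono WAV, then base64-encode it."""
    # wave.open raises wave.Error if the bytes are not a readable WAV file.
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError(f"expected mono audio, got {wav.getnchannels()} channels")
        if wav.getframerate() != 16000:
            raise ValueError(f"expected 16000 Hz audio, got {wav.getframerate()} Hz")
    return base64.b64encode(wav_bytes).decode("ascii")


def build_evaluate_params(wav_bytes: bytes, paragraph_id: int, native_exemplar: bool = False) -> dict:
    """Assemble the params object for pronunciation.evaluate."""
    return {
        "paragraph_id": paragraph_id,
        "audio_wav_base64": encode_wav_for_evaluate(wav_bytes),
        "native_exemplar": native_exemplar,
    }
```

For example, build_evaluate_params(open("reading.wav", "rb").read(), 3) would produce the params object for paragraph 3, where reading.wav is a placeholder file name.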

Example response

This sample shows the structure you receive after a successful pronunciation.evaluate call. The analysis block contains transcription confidence, token alignment, per-target status, and aggregate scoring, while persistence contains generated IDs and output file locations. Use this shape to map results into UI components and interpret success by checking both the summary score and individual target statuses.

{
  "jsonrpc": "2.0",
  "id": "p3-a2a-demo",
  "result": {
    "request_id": "b0f8f5b8-6f6d-4d73-bac2-89f8bf7e0f91",
    "analysis": {
      "deepgram_words": [
        {"word": "They", "start": 0.42, "end": 0.58, "confidence": 0.99},
        {"word": "record", "start": 0.58, "end": 1.02, "confidence": 0.97},
        {"word": "the", "start": 1.02, "end": 1.12, "confidence": 0.98},
        {"word": "project", "start": 1.12, "end": 1.51, "confidence": 0.96}
      ],
      "alignment": [
        {"ref_token": "record", "hyp_word": "record", "confidence": 0.97},
        {"ref_token": "project", "hyp_word": "project", "confidence": 0.96}
      ],
      "targets": [
        {
          "word": "record",
          "pos": "verb",
          "expected_stress": 2,
          "inferred_stress": 2,
          "status": "ok",
          "feedback": "Verb stress is correct.",
          "decision_method": "learned_threshold",
          "duration_ratio_log": -0.221,
          "learned_threshold": -0.103,
          "decision_confidence": 87.4,
          "threshold_stats": {"mu1": 0.164, "sigma1": 0.082, "mu2": -0.241, "sigma2": 0.077, "threshold": -0.103}
        },
        {
          "word": "project",
          "pos": "noun",
          "expected_stress": 1,
          "inferred_stress": 2,
          "status": "needs_work",
          "feedback": "Noun should stress the first syllable.",
          "decision_method": "naive_duration"
        }
      ],
      "score_summary": {
        "percent_correct": 50.0,
        "correct_targets": 1,
        "total_targets": 2
      }
    },
    "persistence": {
      "recording_id": "250227114501123456",
      "wav_path": "gs://syllable-stress/250227114501123456.wav",
      "json_path": "gs://syllable-stress/250227114501123456.json"
    }
  }
}
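
One way to map this response into a UI is to reduce it to the overall score plus the words that need attention. The sketch below relies only on fields visible in the sample above. The function name is illustrative, and treating every status other than "ok" as needing work is an assumption, since only "ok" and "needs_work" appear in the sample.

```python
def summarize_evaluation(response: dict) -> dict:
    """Reduce a pronunciation.evaluate response to a score and a to-do list."""
    if "error" in response:
        raise RuntimeError(f"JSON-RPC error: {response['error']}")
    analysis = response["result"]["analysis"]
    # ASSUMPTION: any status other than "ok" means the word needs work.
    needs_work = [
        (target["word"], target.get("feedback", ""))
        for target in analysis["targets"]
        if target.get("status") != "ok"
    ]
    summary = analysis["score_summary"]
    return {
        "percent_correct": summary["percent_correct"],
        "correct_targets": summary["correct_targets"],
        "total_targets": summary["total_targets"],
        "needs_work": needs_work,
    }
```

Applied to the sample response, this yields a percent_correct of 50.0 and a single needs_work entry for "project" with its feedback string.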

Example curl: Agent Card discovery

Run this command to retrieve the Agent Card metadata from the current deployment URL. It takes no request body or custom parameters and is typically used first to confirm capabilities, method names, and endpoint URLs before integrating. Interpret the pretty-printed JSON to verify protocol version, supported methods, and service metadata.

curl -s "http://guildaidemo.talknicer.com/.well-known/agent.json" | jq .

Example curl: paragraph count

This curl command sends the same paragraphs.count JSON-RPC request from a shell for quick diagnostics. It posts to /a2a with a JSON body and no method-specific params. Use it in scripts, CI health checks, or manual smoke tests, then read the returned count to confirm expected paragraph configuration.

curl -X POST "http://guildaidemo.talknicer.com/a2a" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":"p-count","method":"paragraphs.count","params":{}}'
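
For clients that prefer Python over a shell, the same POST can be sketched with the standard library alone. The helper names make_a2a_request and send are hypothetical, and the timeout value is an arbitrary choice. Only the /a2a path, the Content-Type header, and the JSON-RPC body come from this page.

```python
import json
import urllib.request
from typing import Optional


def make_a2a_request(base_url: str, method: str, params: Optional[dict] = None,
                     request_id: str = "req-1") -> urllib.request.Request:
    """Prepare (but do not send) a JSON-RPC POST to the /a2a endpoint."""
    body = json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": params if params is not None else {},
    }).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/a2a",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling send(make_a2a_request("http://guildaidemo.talknicer.com", "paragraphs.count", request_id="p-count")) would issue the same request as the curl command above.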

Example curl: plain paragraph text

This command requests canonical prompt text for one paragraph ID using paragraphs.get_text. The key parameter is paragraph_id, and you would use it when preparing reading prompts or validating that a client and server are synchronized on paragraph content. Interpret the response text as the authoritative source for that paragraph.

curl -X POST "http://guildaidemo.talknicer.com/a2a" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":"p-text-3","method":"paragraphs.get_text","params":{"paragraph_id":3}}'

Example curl: paragraph 3 evaluation

Use this command-line example to submit audio for full pronunciation scoring against paragraph 3. The required payload field is audio_wav_base64; include paragraph context via paragraph_id (as shown) when evaluating predefined content. Interpret the resulting JSON by examining per-word target statuses and score_summary to assess overall performance.

curl -X POST "http://guildaidemo.talknicer.com/a2a" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":"p3-a2a-demo","method":"pronunciation.evaluate","params":{"paragraph_id":3,"audio_wav_base64":"BASE64_16KHZ_MONO_WAV"}}'

Bring Your Own API Key

The shared Deepgram API key used by this demo may not remain available indefinitely. To ensure uninterrupted access, you can provide your own key: Deepgram offers $200 in free credits with no credit card required. Sign up for Deepgram and generate an API key.
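
For A2A clients, a key can be attached per request through the deepgram_api_key parameter of pronunciation.evaluate. The sketch below reads it from an environment variable so the key stays out of source code. The variable name DEEPGRAM_API_KEY and the helper name are assumptions made for illustration, not something this service defines.

```python
import os
from typing import Mapping, Optional


def with_optional_key(params: dict, environ: Optional[Mapping[str, str]] = None,
                      env_var: str = "DEEPGRAM_API_KEY") -> dict:
    """Return a copy of params with deepgram_api_key added when one is configured."""
    env = os.environ if environ is None else environ
    key = env.get(env_var, "").strip()
    merged = dict(params)  # copy so the caller's dict is not mutated
    if key:
        merged["deepgram_api_key"] = key
    # With no key set, the field is omitted and the server selects its own key.
    return merged
```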
