Async API

Asynchronous APIs (Async API for short) are supported for long running methods. Take audio transcrption as an example, if you wanted to transcribe an hour long audio file then doing a synchronous API call is not viable. You could break the audio file into smaller chunks, say 1 min chunk and then use the synchronous API. On the other hand this is a perfect use case to simply use the Async API. Async API gives you a way to submit a request and then poll for the request to finish.

There are 4 steps in an Async API
  1. Get Payload Presigned URL

  2. Upload Payload

  3. Submit Async request

  4. Check Async request

Get Payload Presigned URL

API endpoint: https://api.tiyaro.ai/v1/input/upload-url

(See following sample code and API documentation for more details on the input params and output.)

This is the first step of sending an Async API request. Use this method to get a presigned URL that you can use to upload your payload. e.g. your audio file.

Upload Payload

In this step you upload your payload to the url that you get from the above call. Since this is a presgined URL you can use any utility to upload your payload file to this presigned URL. See the sample code below to see how you can upload a file in Python

Submit Async request

After you have uploaded your payload. You invoke the Async API for the model that you want to call inference on. The async API endpoint for a model is available on its model card on the tiyaro console. Currently, the whisper AI models are the only ones supporting the Async API. For convenience we are listing those async API end points here

https://api.tiyaro.ai/v1/async/ent/tiyarofs/1/openai/whisper-large?serviceTier=gpuflexhttps://api.tiyaro.ai/v1/async/ent/tiyarofs/1/openai/whisper-medium?serviceTier=gpuflexhttps://api.tiyaro.ai/v1/async/ent/tiyarofs/1/openai/whisper-small?serviceTier=gpuflexhttps://api.tiyaro.ai/v1/async/ent/tiyarofs/1/openai/whisper-tiny?serviceTier=gpuflex

See the sample code below and the API documentation below for details on the input parameters and output from this request. This particular method returns a ‘GET’ url for you to poll for the results of this operation.

Check Async request

You can then poll on the ‘GET’ url returned from the above async request to check if the request is completed. The ‘status’ fields in the response can have one of the following states.

  • accepted - The request has been accepted in the system

  • pending - The request is queued for execution

  • processing - The request is being processed

  • success - The request finished successfully

  • failed - The request failed

  • cancelled - The request was cancelled.

Note: When the status is ‘success’, ths same method also returns the ‘results’ of your request. Again, check the following sample code and API spec for details of the fields.

Note

The following sample code is also available in our sample code repo. You can simply clone that repo and run the fully working sample code.

Sample Python code calling Async API

import requests
import json
import os
import sys
import time

PROD_BASE = 'https://api.tiyaro.ai'
WHISPER_LARGE_ASYNC_PATH = f'{PROD_BASE}/v1/async/ent/tiyarofs/1/openai/whisper-large?serviceTier=gpuflex'
MP3_UPLOAD_URL = f'{PROD_BASE}/v1/input/upload-url?extension=mp3'


def getHeaders():
    api_key = os.environ.get('TIYARO_API_KEY')
    return {
        'Content-Type': 'application/json',
        'Authorization': f'Bearer {api_key}'
    }


def whisper_input():
    return {
        "no_speech_threshold": 0.6,
        "patience": 1,
        "suppress_tokens": "-1",
        "compression_ratio_threshold": 2.4,
        #
        # NOTE Remove 'language' parame if you want native language
        #
        "language": "en",
        "temperature_increment_on_fallback": 0.2,
        "length_penalty": None,
        "logprob_threshold": -1,
        "condition_on_previous_text": True,
        "initial_prompt": None,
        "task": "transcribe",
        "temperature": 0,
        "beam_size": 5,
        "best_of": 5
    }


def get_upload_url(extension='mp3'):
    resp = requests.request("POST", MP3_UPLOAD_URL,
                            json={}, headers=getHeaders())
    assert resp.status_code == 201
    result = json.loads(resp.text)
    uploadURL = result['uploadUrl']['PUT']
    print('-- Input payload_url --', uploadURL)
    return uploadURL


def upload_mp3_to_url(mp3File, upload_url):
    resp = requests.request("PUT", upload_url, data=open(mp3File, 'rb'))
    assert resp.status_code == 200
    print(f'-- {mp3File} uploaded --')


def send_async_infer_request(upload_url):
    modelURL = WHISPER_LARGE_ASYNC_PATH

    payload = {
        "input": whisper_input(),
        "URL": upload_url
    }

    resp = requests.post(modelURL, headers=getHeaders(), json=payload)
    assert resp.status_code == 202
    print('-- async request submitted --')
    result = json.loads(resp.text)
    request_id = result['response']['id']
    print(f'requestId: {request_id}')
    return result['response']['urls']['GET']


def check_status_and_result(inference_result_url):
    status = "NA"
    result = None
    while True:
        resp = requests.request(
            "GET", inference_result_url, headers=getHeaders())
        assert resp.status_code == 200
        result = json.loads(resp.text)
        status = result["status"]
        if status == 'success':
            print("status: ", status)
            break
        print("status: ", status)
        time.sleep(15)
    print(json.dumps(result, indent=2))
    text = result["result"]["text"]
    print("-- Transcribed Text --\n", text)
    print("-- Done -- \n")


def async_infer(input_mp3):
    # Step 1 - Get a presigned url to upload your audio file
    upload_url = get_upload_url()

    # Step 2 - Upload your mp3 file to the presinged url
    upload_mp3_to_url(input_mp3, upload_url)

    # Step 3 - Submit an Async request. You get a inference_result_url that you can poll on.
    inference_result_url = send_async_infer_request(upload_url)

    # Step 4 - Poll/Wait for request to finish
    check_status_and_result(inference_result_url)


def main():
    api_key = os.environ.get('TIYARO_API_KEY')
    if not api_key:
        raise ValueError("TIYARO_API_KEY not set")

    if len(sys.argv) != 2:
        print("Usage: asyncWhisper.py <mp3_file>")
        sys.exit(1)

    input_mp3 = sys.argv[1]
    print("-- processing input file --", input_mp3)

    start = time.time()
    async_infer(input_mp3)

    print("\n--- Inference time:", round(time.time() - start, 2), "secs ---")


if __name__ == "__main__":
    main()

Async API

GET /async_request_specific_poll_url
Status Codes
  • 200 OK – Success

  • default – Unexpected error

Response JSON Object
  • response.acceptedAt (string) – DateTime when the request was accepted (required)

  • response.completedAt (string) – DateTime when the request finished processing (required)

  • response.id (string) – ID of async request (required)

  • response.model (string) – Model name (required)

  • response.result.placholder (string) – This is just a placeholder, each model will have its own model specific results. Please refer to the API (sync) documentation of the model to find out the results returned by that model. (required)

  • response.startedAt (string) – DateTime when the request processing started (required)

  • response.status (string) – Status of the request (required)

POST /model_specific_async_infer_endpoint
Query Parameters
  • serviceTier (string) – Available service tiers: [cpuflex(default), gpuflex]

Request JSON Object
  • URL (string) – The presigned URL where the payload was uploaded. This is the same URL that was returned by the upload-url method (required)

  • input.placholder (string) – This is just a placeholder, each model will have its own model specific input params. Please refer to the API (sync) documentation of the model to find out the parameters that are accepted by that specific model. (required)

Status Codes
  • 200 OK – Success

  • default – Unexpected error

Response JSON Object
  • response.response.acceptedAt (string) – DateTime when the request was accepted (required)

  • response.response.completedAt (string) – DateTime when the request finished processing (required)

  • response.response.id (string) – ID of async request (required)

  • response.response.model (string) – Model name (required)

  • response.response.result (string) – None (required)

  • response.response.startedAt (string) – DateTime when the request processing started (required)

  • response.response.status (string) – Status of the request (required)

  • response.response.urls.GET (string) – The presigned URL where you can poll for you async result completiong

POST /v1/input/upload-url
Query Parameters
  • extension (string) – Extension of the uploaded payload. Sample values ‘mp3’ for audio.

Status Codes
  • 200 OK – Success

  • default – Unexpected error

Response JSON Object
  • response.uplaodUrl.PUT (string) – The presigned URL where you can PUT(upload) your payload