Techpadi

Microsoft’s Azure OpenAI Language Model

There are many ways to deliver AI services, but with technologies like Azure’s Cognitive Services offering API-based access to pre-trained models, modern machine learning and AI research has moved quickly from the lab to our IDEs. One of the more promising approaches to working with language is the generative pre-trained transformer, or GPT, a family of models trained on large amounts of text.

Adding OpenAI to Azure
OpenAI’s generative AI tools were developed and trained on Azure servers. Under a long-standing agreement between OpenAI and Microsoft, OpenAI’s tools are made available as part of Azure, with Azure-specific APIs and integration with Azure’s billing systems. The Azure OpenAI suite of APIs, which previously supported only a private preview of GPT-3 text generation and the Codex code model, is now generally available. Microsoft has promised DALL-E image generation in a future release.

This does not mean that anyone can build an application that uses GPT-3: Microsoft continues to restrict access to ensure that projects adhere to its ethical AI usage principles and are narrowly focused on specific use cases. Access to Azure OpenAI is also limited to direct Microsoft customers. Microsoft applies a similar procedure to its Limited Access Cognitive Services, where there is a risk of impersonation or privacy violations.

Azure OpenAI Tokens
Token-based pricing is a crucial component of OpenAI models. Rather than the familiar authentication token, an Azure OpenAI token is a tokenized chunk of a string, generated by an internal statistical model. To help you understand how your queries are charged, OpenAI offers a tool on its website that shows how strings are tokenized. A token may be more or fewer than four characters, but you can expect roughly 100 tokens per 75 words (about a paragraph of normal text).

Token prices increase with model complexity. Ada, the entry-level model, costs approximately $0.0004 per 1,000 tokens, while Davinci, the most capable model, costs $0.02 per 1,000 tokens. Using your own fine-tuned model incurs a storage cost, and working with embeddings can cost orders of magnitude more because of the greater compute demands. Fine-tuning itself adds expenses starting at $20 per compute hour. The Azure website lists sample charges, but actual costs may differ depending on your organization’s Microsoft account status.
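The per-1,000-token rates quoted above translate directly into a back-of-the-envelope cost calculator. The snippet below hardcodes the sample figures from this article ($0.0004 for Ada, $0.02 for Davinci); these are illustrative, and the Azure pricing page is the source of truth for current rates.

```python
# Approximate charge per request, using the sample per-1,000-token rates
# quoted in the article. Check the Azure pricing page for current figures.

RATES_PER_1K_TOKENS = {
    "ada": 0.0004,      # entry-level model
    "davinci": 0.02,    # most capable model
}

def estimate_cost(tokens: int, model: str) -> float:
    """Return the approximate charge in USD for a given token count."""
    return tokens / 1000 * RATES_PER_1K_TOKENS[model]

# A ~75-word paragraph is roughly 100 tokens:
print(f"Ada:     ${estimate_cost(100, 'ada'):.6f}")
print(f"Davinci: ${estimate_cost(100, 'davinci'):.6f}")
```

At these rates a paragraph-sized prompt costs fractions of a cent on either model, which is why the pricing gap mostly matters at scale, across millions of tokens.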
