The OpenAI API embedding request runs asynchronously and is parallelized for multiple requests.
We can now take a token (here, a token = a word, a sentence, or a large paragraph) and get the full embedding (sometimes referred to as an encoding) via the OpenAI API with “encodeToken”. If we want to do this for several tokens, we can do it in parallel with “encodeTokensInParallel”.
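As a rough illustration of the parallel-encoding idea, here is a minimal Python sketch using asyncio. The function names mirror the note's “encodeToken”/“encodeTokensInParallel”, but the body of encode_token is a stand-in (simulated latency and a fake vector), not the real API call:

```python
import asyncio

async def encode_token(token: str) -> list[float]:
    # Stand-in for the real embedding request: simulate network
    # latency and return a fake fixed-size vector.
    await asyncio.sleep(0.01)
    return [float(len(token)), float(ord(token[0]))]

async def encode_tokens_in_parallel(tokens: list[str]) -> list[list[float]]:
    # asyncio.gather fires all requests concurrently and
    # returns the results in the same order as the input tokens.
    return await asyncio.gather(*(encode_token(t) for t in tokens))

embeddings = asyncio.run(encode_tokens_in_parallel(["hello", "world", "today"]))
```

Because gather preserves input order, the i-th embedding always corresponds to the i-th token, which keeps the parallel version a drop-in replacement for a sequential loop.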
I think the next steps are:
- modify SNLPEncoder (the superclass of the async encoder) to remove the now-useless “encodeSentence” and add “encodeTokensInParallel” instead.
- modify DictionaryCorpus to make it asynchronous, or simply create an async DictionaryCorpus; otherwise it is impossible to use it with an asynchronous encoder.
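The second step above could be sketched as follows. This is only an assumption about the shape of the classes (the real SNLPEncoder/DictionaryCorpus interfaces are not shown in the note); the point is that the corpus must await the encoder rather than call it synchronously:

```python
import asyncio

class AsyncEncoder:
    # Hypothetical stand-in for the async encoder described above;
    # the real class would issue OpenAI API requests here.
    async def encode_token(self, token: str) -> list[float]:
        await asyncio.sleep(0.01)  # simulated request latency
        return [float(len(token))]

    async def encode_tokens_in_parallel(self, tokens: list[str]) -> list[list[float]]:
        return await asyncio.gather(*(self.encode_token(t) for t in tokens))

class AsyncDictionaryCorpus:
    # Sketch of an async corpus: its population method is itself a
    # coroutine, so it can await the asynchronous encoder.
    def __init__(self, encoder: AsyncEncoder):
        self.encoder = encoder
        self.entries: dict[str, list[float]] = {}

    async def add_tokens(self, tokens: list[str]) -> None:
        vectors = await self.encoder.encode_tokens_in_parallel(tokens)
        self.entries.update(zip(tokens, vectors))

corpus = AsyncDictionaryCorpus(AsyncEncoder())
asyncio.run(corpus.add_tokens(["cat", "mouse"]))
```

A synchronous corpus cannot call `await`, which is why the note concludes that either the existing class must become async or a parallel async variant must be created.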
Edited by Jean Nordmann