CoreML models can be registered and used automatically with the CoreMLEncoder. To add a new model:
1. Add the .mlpackage file to ```Sources/SwiftNLP/Resources```.
2. From the ```swiftnlp``` directory, run the following from the command line:
- ```xcrun coremlcompiler generate Sources/SwiftNLP/Resources/[name of mlpackage file]/ --language Swift Sources/SwiftNLPGenericLLMMacros```

This should generate a .swift file in ```Sources/SwiftNLPGenericLLMMacros```.
3. Add the following script steps to the ```.compile_models``` item in the ```.gitlab-ci.yml``` file:
- ```xcrun coremlcompiler compile Sources/SwiftNLP/Resources/[name of mlpackage file]/ Sources/SwiftNLP/Models```
- ```xcrun coremlcompiler generate Sources/SwiftNLP/Resources/[name of mlpackage file]/ --language Swift Sources/SwiftNLP/Resources```
- ```mv Sources/SwiftNLP/Resources/[name of the generated swift file] Sources/SwiftNLP/2.\ Encoding```
4. Navigate to ```Sources/SwiftNLPGenericLLMMacros/ModelClasses.swift```.
- Add the following to the `LLM_MODEL_CLASSES` map:
```
"[string name of model]": [
    LLMModelClassesKey.Input: [name of model input class].self,
    LLMModelClassesKey.Output: [name of model output class].self,
    LLMModelClassesKey.Model: [name of model class].self,
    LLMModelClassesKey.FeatureName: [name of feature to use],
    LLMModelClassesKey.URL: "[name of ml package file without .mlpackage].mlmodelc",
    LLMModelClassesKey.InputDimension: [input size]
]
```
- Notes on how to perform the above step:
  - [string name of model] can be any string you want. It is the identifier you later use to select the model.
  - [name of model input class], [name of model output class], and [name of model class] are the names of the classes in the generated Swift file.
  - [name of feature to use] is a field of the model output class. Typically it is `embeddings`, though some models may have other fields such as `pooler_output`.
  - [input size] can be found in the auto-generated documentation comments of the model input class.
5. Build the project. The model can now be used anywhere. To access the model, instantiate an `LLMEmbeddings` object with the model's string identifier. If using the `CoreMLEncoder`, set its `model` field to the model's string identifier.
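
To make step 4 more concrete, here is a minimal sketch of what a filled-in entry might look like. Everything specific in it is an assumption for illustration: the model name `my-model`, the class names `my_modelInput`, `my_modelOutput`, and `my_model`, the feature name given as a string, and the input dimension of 512 are all hypothetical. The stand-in declarations at the top exist only so the snippet compiles on its own; in the project, the key type comes from `SwiftNLPGenericLLMMacros` and the classes come from the generated Swift file, and their real definitions may differ.

```swift
// Stand-ins so this sketch compiles on its own. In the project, the key type
// and the three model classes already exist; all names here are hypothetical.
enum LLMModelClassesKey {
    case Input, Output, Model, FeatureName, URL, InputDimension
}

final class my_modelInput {}
final class my_modelOutput {}
final class my_model {}

// Assumed shape of the map: model identifier -> per-model settings.
let LLM_MODEL_CLASSES: [String: [LLMModelClassesKey: Any]] = [
    "my-model": [
        LLMModelClassesKey.Input: my_modelInput.self,
        LLMModelClassesKey.Output: my_modelOutput.self,
        LLMModelClassesKey.Model: my_model.self,
        // Assumption: the feature name is given as a string.
        LLMModelClassesKey.FeatureName: "embeddings",
        // The .mlpackage name with its extension swapped for .mlmodelc.
        LLMModelClassesKey.URL: "my_model.mlmodelc",
        // Assumption: a single integer input size.
        LLMModelClassesKey.InputDimension: 512
    ]
]

// Look up the entry by its string identifier and read one setting back.
if let entry = LLM_MODEL_CLASSES["my-model"] {
    print(entry[.URL] as? String ?? "missing URL")
}
```

When adapting this, copy only the `"my-model": [ ... ]` entry into the existing map in `ModelClasses.swift` and replace each value with the names from your own generated file.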