Note: Data management requires access to Dojo (only available to current customers).
The Miso Answers engine can work with several kinds of data:
- Your content catalog. This includes content that’s consumed by your readers, such as a news article or opinion piece. This is the data the Answers engine will use to answer user questions. As a best practice, teams usually send us the full catalog and then filter out what should be excluded from Answers. This helps us avoid uncertainties about what has or hasn’t been uploaded to Miso. You can continue to upload/delete data at any time; Miso does not charge for indexing content.
- You can also send transcripts for videos and audio files.
- (Optional) Databases and lists. This is useful when you may want the system to extract specific data values rather than combing through the content archive.
- e.g. profile pages about an investment fund, design agencies, supermarkets, hospitals, etc.
- e.g. ranked lists such as Top 50 or award winners
It is also possible to provide data about your site’s users and visitors, though this is not required. Since Miso is a privacy-first platform, many customers choose to provide hashed user data (and we ask that you please do not send any PII to Miso).
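As an illustration of providing hashed user data, here is a minimal sketch of hashing a user identifier before it leaves your systems. This page does not prescribe a hashing scheme, so the choice of keyed HMAC-SHA-256, the normalization step, and the name hash_user_id are all assumptions made for this example.

```python
import hashlib
import hmac


def hash_user_id(raw_id: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest (hex) of a user identifier.

    Assumption: the scheme is illustrative only. The key stays on your
    side so the digest cannot be rebuilt from a list of known emails.
    """
    # Normalize so the same user always maps to the same digest.
    normalized = raw_id.strip().lower()
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical usage: send `hashed` as the user identifier instead of the email.
hashed = hash_user_id("Reader@Example.com ", b"replace-with-your-own-secret")
```

Using a keyed hash rather than a bare SHA-256 is a design choice in this sketch: it makes it harder to reverse the digest by hashing a list of candidate identifiers.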
Data Upload Options
You have three options for providing data:
- WordPress integration
- Upload via API
- Send data through SFTP
These methods let Miso keep a near real-time sync with your catalog.
Integrating With Miso’s Data APIs
Miso’s Data APIs let you automatically upload and manage your data in Miso. These APIs all support high-throughput data ingestion through bulk insert, and they support GDPR and CCPA compliance by letting users delete their data from Miso.
Uploading and Managing Your Product Catalog Data
Miso’s Product APIs let you upload, read, and delete Product records that represent your site’s content.
API Throughput
We recommend batching up your Product Upload API calls and sending around 100 records at a time to avoid timeout and memory risks.
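The batching recommendation can be sketched as follows. The helper names (batched, upload_all, send_batch) are hypothetical, and the {"data": [...]} payload shape is an assumption based on the other examples on this page. The actual HTTP call is left to a callback so the sketch stays independent of any client library.

```python
from typing import Callable, Iterator


def batched(records: list[dict], batch_size: int = 100) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `batch_size` records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


def upload_all(
    records: list[dict],
    send_batch: Callable[[dict], None],
    batch_size: int = 100,
) -> int:
    """Send records in batches and return the number of requests made.

    `send_batch` is a hypothetical callback that performs one upload
    request with the given payload.
    """
    requests_made = 0
    for batch in batched(records, batch_size):
        send_batch({"data": batch})  # assumed payload shape
        requests_made += 1
    return requests_made
```

With 250 records and the default batch size, this makes three requests of 100, 100, and 50 records.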
Product Data Model Design
To fully optimize your Answers engine (and increase our ability to accommodate fine-tuning requirements), it is important to provide Miso with Product records that are complete and accurate. We define a set of common attributes that capture the basics of most content media products, such as title, description, categories, tags, authors, etc. You can also use custom_attributes to specify any additional information from your catalog. Miso can handle hundreds of custom attributes, so don’t hesitate to provide as much catalog metadata as you have.
How to Use Custom Attributes
If your products’ characteristics cannot be fully captured by the fields that Miso defines, you can also specify custom_attributes. The more complete the product information is, the smarter Miso becomes. For example, if your product summaries support multiple languages, you can have something like: custom_attributes.alternative_langs: ["en", "zh"].
You might also want to include attributes that are required for presenting summaries in the front end (e.g. cover image, raw rating scores, etc.) so that you don’t need additional requests to fetch those fields.
It’s important to put all additional textual content into the description field. For example, if you have a Product Summary that’s broken into several sections, you should put the entire text contents in the description field rather than creating custom attributes for each section. That’s because Miso only analyzes the description and html fields as text to perform tokenization, topic modeling, etc. We analyze all the remaining fields as model features or keywords.
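To make this guidance concrete, here is a minimal sketch that joins a multi-section summary into the description field and keeps the remaining metadata in custom_attributes. The function name build_product_record, the product_id field name, and the cover_image attribute are assumptions for illustration; title, description, custom_attributes, and alternative_langs come from the text above.

```python
def build_product_record(
    product_id: str,
    title: str,
    sections: list[str],
    custom_attributes: dict,
) -> dict:
    """Build one Product record with all section text in `description`."""
    # Join the non-empty sections so the full text lands in one field,
    # instead of one custom attribute per section.
    description = "\n\n".join(s.strip() for s in sections if s.strip())
    return {
        "product_id": product_id,  # assumed field name
        "title": title,
        "description": description,
        "custom_attributes": custom_attributes,
    }


record = build_product_record(
    "article-1",
    "Example article",
    ["Overview text.", "Key findings text.", "Methodology text."],
    {
        "alternative_langs": ["en", "zh"],
        "cover_image": "https://example.com/cover.jpg",  # hypothetical attribute
    },
)
```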
Sometimes the design of custom attributes can require some thought. Your Miso solutions architects are available to help you think through the engineering implications. For example, if you have products that are only available in certain regions, how should you represent that? There are a few options:
- Option 1: keep a single list such as {"regional_availability": ["region_1", "region_3", ...]} that contains all the regions that currently have the product.
- Option 2: keep one attribute per region, such as {"region_1_availability": true, "region_2_availability": false, ...}, where each attribute represents availability in a particular region.
For a “partial” update, option 2 might be easier, because you will only need to gather the availability for the specific regions you want to update.
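The trade-off between the two representations can be sketched as below. Both helper names are hypothetical, and the sketch only shows what data you need to have on hand for each design; it makes no claim about how Miso merges a partial update.

```python
def update_region_list(
    current_regions: list[str], region: str, available: bool
) -> list[str]:
    """List design: rebuilding the value requires the full current list."""
    regions = set(current_regions)
    if available:
        regions.add(region)
    else:
        regions.discard(region)
    return sorted(regions)


def region_flag_patch(changes: dict[str, bool]) -> dict[str, bool]:
    """Per-region design: only the regions that changed are needed."""
    return {f"{region}_availability": available for region, available in changes.items()}


# List design: you must first know that region_1 and region_3 are already set.
new_list = update_region_list(["region_1", "region_3"], "region_2", True)

# Per-region design: the patch mentions region_2 and nothing else.
patch = region_flag_patch({"region_2": True})
```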
Viewing Data
In Dojo, you can see the data that you’ve uploaded to Miso in a visual dashboard. See the Dojo Data Sets guide for more details.
You can also read your Product and User data via API.
Deleting Data
Dojo provides a way to delete data records and even wipe all the data in your environment. See the Dojo Data Sets guide for more details.
Programmatically, data can be deleted via API as well. Here is an example of how to delete data in bulk:
Get all product ids:
GET https://api.askmiso.com/v1/products/_ids
Body: NA
Response(200): {"data": {"ids": ["PRODUCT_1", "PRODUCT_2"]}}
Bulk delete products by ids:
POST https://api.askmiso.com/v1/products/_delete
Body: {"data": {"product_ids": ["PRODUCT_1", "PRODUCT_2"]}}
Response(200): {"message": "success"}
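The two calls above could be wired together roughly as follows, using only the Python standard library. The helper names are hypothetical, and authentication is not shown on this page, so the sketch omits it; add whatever credentials your Miso environment requires before sending. The sketch builds the request without sending it.

```python
import json
import urllib.request

BASE_URL = "https://api.askmiso.com/v1"


def extract_ids(ids_response: dict) -> list[str]:
    """Pull the product ids out of a parsed /products/_ids response."""
    return ids_response["data"]["ids"]


def build_delete_request(product_ids: list[str]) -> urllib.request.Request:
    """Build (but do not send) the bulk delete request for the given ids."""
    body = json.dumps({"data": {"product_ids": product_ids}}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/products/_delete",
        data=body,
        headers={"Content-Type": "application/json"},  # credentials omitted (assumption)
        method="POST",
    )


# Parsed body of the GET response shown above.
ids = extract_ids({"data": {"ids": ["PRODUCT_1", "PRODUCT_2"]}})
request = build_delete_request(ids)
```

Passing the request to urllib.request.urlopen would perform the delete, so review the id list first: the operation removes every listed record.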
Troubleshooting
422 Errors (Schema Validation)
When a data upload fails, it is usually due to a schema (formatting) error.
Any schema error will cause the whole request to fail: the API will return status_code=422, and none of the records will be inserted. You should check the data field in the response to see where the errors are located. For example, the response below means there are schema errors in the interaction record at index 0:
{
"errors": true, // there are errors. please check!
"message": "None of the records were inserted because at least one of them contained schema errors. Please see the `data` field for details.",
"data": [
"data.0.product_ids is invalid. The attribute was expected to be of type ''array', 'null'' but type 'string' was given.",
"data.0.timestamp is invalid. The attribute should match the 'date-time' format."
]
}
We log schema errors on our end as well. If you ever need help troubleshooting, contact your Miso solutions engineer. We’re confident we can solve the problem together.
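When a large batch fails, it can help to group the 422 messages by record index so you know which records to fix. The sketch below assumes every message starts with data.<index>., as in the sample response above; the function name group_errors_by_record is hypothetical, and messages that do not match are collected under index -1.

```python
import re
from collections import defaultdict

# Matches the leading "data.<index>." prefix seen in the sample response.
_INDEX_PATTERN = re.compile(r"^data\.(\d+)\.")


def group_errors_by_record(response: dict) -> dict[int, list[str]]:
    """Group schema error messages by the index of the offending record."""
    grouped: dict[int, list[str]] = defaultdict(list)
    for message in response.get("data", []):
        match = _INDEX_PATTERN.match(message)
        index = int(match.group(1)) if match else -1  # -1: unrecognized format
        grouped[index].append(message)
    return dict(grouped)


sample = {
    "errors": True,
    "message": "None of the records were inserted ...",
    "data": [
        "data.0.product_ids is invalid. The attribute was expected to be of type ''array', 'null'' but type 'string' was given.",
        "data.0.timestamp is invalid. The attribute should match the 'date-time' format.",
    ],
}
errors_by_record = group_errors_by_record(sample)
```

Because one schema error rejects the whole request, the full batch needs to be sent again after the listed records are corrected.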
Incorrect Data
It’s a good idea to review your data in Dojo to make sure that it looks correct. If data has been uploaded in error, you can delete it, or overwrite the incorrect fields by uploading a new version. Miso automatically deduplicates records during upload and will keep the newest version.