Training of model #52

Open
magnuskarlssonhm opened this issue Jul 4, 2020 · 5 comments

magnuskarlssonhm commented Jul 4, 2020

I tried to use the catalog.xlsx file and the three CSVs with timestamps in one folder, but got errors.

Catalog
image
Usage
image

Postman request

image

Postman response
{
  "id": "922792be-1add-485f-af51-07513ad8cb88",
  "description": "Simple recommendations model",
  "creationTime": "2020-07-04T21:43:03.9316674Z",
  "modelStatus": "Failed",
  "modelStatusMessage": "Failed to parse catalog file or parsing found no valid items",
  "parameters": {
    "blobContainerName": "trainingdata",
    "catalogFileRelativePath": "catalogs/catalog.xlsx",
    "usageRelativePath": "usage2",
    "supportThreshold": 6,
    "cooccurrenceUnit": "User",
    "similarityFunction": "Jaccard",
    "enableColdItemPlacement": true,
    "enableColdToColdRecommendations": false,
    "enableUserAffinity": true,
    "enableUserToItemRecommendations": false,
    "allowSeedItemsInRecommendations": true,
    "enableBackfilling": true,
    "decayPeriodInDays": 30
  },
  "statistics": {
    "totalDuration": "00:00:00",
    "trainingDuration": "00:00:00",
    "catalogParsing": {
      "duration": "00:00:00.0295404",
      "errors": [
        {
          "count": 65,
          "error": "MissingFields",
          "sample": {
            "file": "catalogs/catalog.xlsx",
            "line": 1
          }
        },
        {
          "count": 34,
          "error": "IllegalCharactersInItemId",
          "sample": {
            "file": "catalogs/catalog.xlsx",
            "line": 5
          }
        },
        {
          "count": 1,
          "error": "ItemIdTooLong",
          "sample": {
            "file": "catalogs/catalog.xlsx",
            "line": 14
          }
        },
        {
          "count": 1,
          "error": "MalformedLine",
          "sample": {
            "file": "catalogs/catalog.xlsx",
            "line": 53
          }
        }
      ],
      "successfulLinesCount": 0,
      "totalLinesCount": 101
    },
    "numberOfCatalogItems": 0,
    "numberOfUsageItems": 0,
    "numberOfUsers": 0
  }
}
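
The response above already reports which error types occurred and a sample line for each. As a rough sketch (not part of the service), the following Python turns such a status response into a short summary. It assumes the response has the same field names as the JSON above; `summarize_parsing_errors` is a hypothetical helper name, and the embedded response is trimmed to two of the four errors.

```python
import json


def summarize_parsing_errors(model_status, section="catalogParsing"):
    """Return one summary line per error type in a parsing section.

    `model_status` is the decoded JSON of a model status response;
    `section` is "catalogParsing" or "usageEventsParsing".
    """
    parsing = model_status.get("statistics", {}).get(section, {})
    lines = [
        "{ok} of {total} lines parsed successfully".format(
            ok=parsing.get("successfulLinesCount", 0),
            total=parsing.get("totalLinesCount", 0),
        )
    ]
    for entry in parsing.get("errors", []):
        sample = entry.get("sample", {})
        lines.append(
            "{error}: {count} occurrence(s), e.g. {file} line {line}".format(
                error=entry["error"],
                count=entry["count"],
                file=sample.get("file", "?"),
                line=sample.get("line", "?"),
            )
        )
    return lines


# Trimmed copy of the failed response (only two error entries kept).
response_text = """
{
  "modelStatus": "Failed",
  "statistics": {
    "catalogParsing": {
      "errors": [
        {"count": 65, "error": "MissingFields",
         "sample": {"file": "catalogs/catalog.xlsx", "line": 1}},
        {"count": 34, "error": "IllegalCharactersInItemId",
         "sample": {"file": "catalogs/catalog.xlsx", "line": 5}}
      ],
      "successfulLinesCount": 0,
      "totalLinesCount": 101
    }
  }
}
"""

summary = summarize_parsing_errors(json.loads(response_text))
for line in summary:
    print(line)
```

Passing `section="usageEventsParsing"` would summarize usage file errors in the same way, assuming that section is present in the response.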

Then I tried the other sample data.
Postman request

image

Training succeeded this time, but when I ask for recommendations the numbers I get back seem too high, so something looks wrong here. Which files were most recently tested, and what setup is used for testing?
/api/models/fa2dd4a3-c783-4c0f-8a45-d801a2ee746b/recommend?itemId=2005018

image
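
For anyone reproducing the request above outside Postman, here is a small sketch of how the recommend URL can be assembled in Python. The path and the `itemId` query parameter are taken from the request above; the base URL `https://example.invalid` is a placeholder, `recommend_url` is a hypothetical helper name, and any authentication header the service requires is deliberately left out because it is not shown in this thread.

```python
from urllib.parse import quote, urlencode


def recommend_url(base_url, model_id, item_id):
    """Build the item-to-item recommend URL for a trained model."""
    # Escape the model id so it is safe as a single path segment.
    path = "/api/models/{}/recommend".format(quote(model_id, safe=""))
    # urlencode takes care of escaping the query string value.
    return base_url.rstrip("/") + path + "?" + urlencode({"itemId": item_id})


# Placeholder host; replace with the address of your own deployment.
url = recommend_url(
    "https://example.invalid",
    "fa2dd4a3-c783-4c0f-8a45-d801a2ee746b",
    "2005018",
)
print(url)
```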

@magnuskarlssonhm
Author

Files used in first attempt
image

Files used in second attempt
image

@magnuskarlssonhm
Author

So I have now managed to train a model using the other files. However, it is quite unclear to me which parameter values go together when training the model.

{
  "id": "e633750f-c872-4fc0-ac8b-ec6f58f5e39e",
  "description": "Simple recommendations model",
  "creationTime": "2020-07-05T19:58:33.11227Z",
  "modelStatus": "Completed",
  "modelStatusMessage": "Model Training Completed Successfully",
  "parameters": {
    "blobContainerName": "trainingdata",
    "catalogFileRelativePath": "catalogs/catalog.xlsx",
    "usageRelativePath": "usage2",
    "supportThreshold": 6,
    "cooccurrenceUnit": "User",
    "similarityFunction": "Jaccard",
    "enableColdItemPlacement": false,
    "enableColdToColdRecommendations": false,
    "enableUserAffinity": false,
    "enableUserToItemRecommendations": true,
    "allowSeedItemsInRecommendations": false,
    "enableBackfilling": false,
    "decayPeriodInDays": 30
  },
  "statistics": {
    "totalDuration": "00:39:12.9432509",
    "trainingDuration": "00:01:04.8441939",
    "storingUserHistoryDuration": "00:23:23.9673149",
    "usageEventsParsing": {
      "duration": "00:15:34.3498716",
      "errors": [
        {
          "count": 10,
          "error": "IllegalCharactersInUserId",
          "sample": {
            "file": "usage2/usage1.csv",
            "line": 1503672
          }
        }
      ],
      "successfulLinesCount": 9993516,
      "totalLinesCount": 9993526
    },
    "numberOfUsageItems": 2424,
    "numberOfUsers": 6863299
  }
}
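
To see which training parameters actually changed between the two responses in this thread, a quick diff helps. This is only a sketch: `diff_parameters` is a hypothetical helper, the dictionaries are copied from the `parameters` sections of the two responses, and the output only lists what differs. It does not say which combinations of values are valid.

```python
def diff_parameters(first, second):
    """Return {name: (first_value, second_value)} for parameters that differ."""
    names = sorted(set(first) | set(second))
    return {
        name: (first.get(name), second.get(name))
        for name in names
        if first.get(name) != second.get(name)
    }


# "parameters" from the first (Failed) response.
first_run = {
    "blobContainerName": "trainingdata",
    "catalogFileRelativePath": "catalogs/catalog.xlsx",
    "usageRelativePath": "usage2",
    "supportThreshold": 6,
    "cooccurrenceUnit": "User",
    "similarityFunction": "Jaccard",
    "enableColdItemPlacement": True,
    "enableColdToColdRecommendations": False,
    "enableUserAffinity": True,
    "enableUserToItemRecommendations": False,
    "allowSeedItemsInRecommendations": True,
    "enableBackfilling": True,
    "decayPeriodInDays": 30,
}

# "parameters" from the second (Completed) response.
second_run = dict(
    first_run,
    enableColdItemPlacement=False,
    enableUserAffinity=False,
    enableUserToItemRecommendations=True,
    allowSeedItemsInRecommendations=False,
    enableBackfilling=False,
)

changed = diff_parameters(first_run, second_run)
for name, (before, after) in changed.items():
    print("{}: {} -> {}".format(name, before, after))
```

Five boolean flags differ between the two runs; the container, paths, `supportThreshold`, `cooccurrenceUnit`, `similarityFunction` and `decayPeriodInDays` are identical.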

@natinimni
Contributor

Hi @magnuskarlssonhm, I'm not sure what you're asking here. Regarding the errors you got when using your catalog file: the response you pasted contains more details about them. I see "MissingFields", "IllegalCharactersInItemId" and other errors in your catalog file, and the response also reports which lines failed.
You shared a screenshot showing the names of your files; I'm not sure what your intention was there.
As for your last question about the training values, please refer to the API reference document.

@magnuskarlssonhm
Author

@natinimni Hi, thanks for responding. The screenshots show the suggested example catalog and usage files referenced in the documentation.
image

@natinimni
Contributor

Got it, I apologize. Consider using the files from the C# sample.
