What kinds of things should I track?
Everything! But to start out, we recommend picking just 1-4 behaviors: a single key metric that you would bet your platform on (for Amazon this would be item purchases), plus up to three other common behaviors that let you gauge users' tastes more robustly (things like clicks, views, likes, shares, etc.).
How do I figure out what a behavior’s desirability should be?
The desirability of a behavior is set when you create the behavior, and represents how desirable it is for users to perform that behavior. This depends on what your platform’s objectives are, and on what you think your users’ preferences are. For example, a ‘like’ is generally good, but it may be less desirable than a ‘purchase’. When you decide on your behaviors’ desirability values, keep in mind that they are assessed relative to each other, so setting the desirability of ‘like’ to 0.5 and ‘purchase’ to 1.0 is equivalent to setting them to 0.25 and 0.5, respectively.
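The equivalence of those two settings can be checked with a little arithmetic. The sketch below is plain Python and does not call the service; `relative_desirability` is a hypothetical helper that rescales each value by the largest one, which is only one way to compare two configurations and is not a description of how the engine weights behaviors internally.

```python
def relative_desirability(desirabilities):
    """Rescale desirability values so that the largest one becomes 1.0.

    Hypothetical helper for illustration only; the engine's internal
    weighting is not documented here.
    """
    top = max(desirabilities.values())
    return {name: value / top for name, value in desirabilities.items()}


config_a = {"like": 0.5, "purchase": 1.0}
config_b = {"like": 0.25, "purchase": 0.5}

# Both configurations say a 'like' is half as desirable as a 'purchase'.
same = relative_desirability(config_a) == relative_desirability(config_b)
print(same)  # prints True
```

In both cases a ‘like’ comes out at half the desirability of a ‘purchase’, which is the only thing that matters when values are assessed relative to each other.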
What is the difference between a behavior’s desirability and an event’s value?
The value supplied when you track an event is only relevant to that particular event – it represents how much of the event occurred (how long did the user perform the ‘watch’ behavior on that movie, what percent of the article did they ‘read’, etc.) – and is not directly related to the behavior itself.
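One way to picture the distinction is as two separate records. This is a minimal sketch in plain Python, not the service's actual data model: the `Behavior` and `Event` classes, their field names, and the example desirability of 0.6 are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Behavior:
    name: str
    desirability: float  # set once, when the behavior is created


@dataclass(frozen=True)
class Event:
    user_id: str
    item_id: str
    behavior: Behavior
    value: float  # how much of the behavior occurred in this one event


# Hypothetical behavior; 0.6 is an arbitrary example desirability.
watch = Behavior(name="watch", desirability=0.6)

# Same behavior and same desirability, but a different value per event:
# here, the fraction of each movie that was watched.
events = [
    Event("user-1", "movie-42", watch, value=0.9),
    Event("user-1", "movie-77", watch, value=0.1),
]

for event in events:
    print(event.item_id, event.behavior.desirability, event.value)
```

The desirability lives on the behavior and is shared by every ‘watch’ event, while the value changes from one event to the next.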
I have different types of items, should I put them all in the same engine?
Short answer: definitely.
Longer answer: you probably want to recommend different types of items in different contexts. Fortunately, this is really easy – items can be given properties and tags, and every property and tag can be used to filter recommendations. For example, you might give every item a 'type' property, and then supply a filter argument when you request recommendations so that only items where type == 'fission reactor' are returned (if, say, you sold power plants).
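The effect of such a filter can be sketched locally. The snippet below is plain Python over an in-memory list and does not use the service's filter syntax; the item layout, the `filter_by_property` helper, and the sample items are hypothetical.

```python
# Hypothetical catalog: every item carries a 'type' property and some tags.
items = [
    {"id": "item-1", "properties": {"type": "fission reactor"}, "tags": ["nuclear"]},
    {"id": "item-2", "properties": {"type": "wind farm"}, "tags": ["renewable"]},
    {"id": "item-3", "properties": {"type": "fission reactor"}, "tags": ["nuclear", "compact"]},
]


def filter_by_property(candidates, key, expected):
    """Keep only the items whose property `key` equals `expected`."""
    return [item for item in candidates if item["properties"].get(key) == expected]


reactors = filter_by_property(items, "type", "fission reactor")
print([item["id"] for item in reactors])  # prints ['item-1', 'item-3']
```

All item types stay in one engine, and the filter narrows the results to the type that suits the context of the request.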
How do I fill in gaps between when I create my engine and when I begin streaming real-time events?
Mark down the time when you begin tracking events, as well as the last timestamp in the events dataset you used to seed your engine, then use Event Batch to stream the events that occurred between those two times.
If you do not know these values precisely, or if you set the 'created' field for your events manually and believe you may have streamed events within the period you want to fill in, you can use the Event Retrieve method's 'created_since' and 'created_before' fields to scrape all events over that period to find the gaps.
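The selection step described above might look roughly like the following. This is a sketch in plain Python that makes no calls to the service: the event layout (an 'id' plus a 'created' timestamp), the `events_to_backfill` and `chunk` helpers, the exclusive treatment of both boundaries, and the batch size are all assumptions. Sending each batch through Event Batch, and fetching the ids of already-streamed events through Event Retrieve, is left out.

```python
from datetime import datetime, timezone


def events_to_backfill(events, seed_end, tracking_start, already_streamed_ids=()):
    """Pick events created strictly between seed_end and tracking_start.

    Events whose id is in already_streamed_ids are skipped, so nothing is
    sent twice. The result is ordered oldest first.
    """
    known = set(already_streamed_ids)
    gap = [
        e for e in events
        if seed_end < e["created"] < tracking_start and e["id"] not in known
    ]
    return sorted(gap, key=lambda e: e["created"])


def chunk(seq, size):
    """Split a list into consecutive batches of at most `size` elements."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def day(n):
    # Hypothetical sample timestamps, all in January 2024 (UTC).
    return datetime(2024, 1, n, tzinfo=timezone.utc)


history = [
    {"id": "e1", "created": day(1)},  # already part of the seed dataset
    {"id": "e2", "created": day(3)},
    {"id": "e3", "created": day(4)},  # already streamed manually
    {"id": "e4", "created": day(5)},
    {"id": "e5", "created": day(7)},  # captured by live tracking
]

missing = events_to_backfill(
    history,
    seed_end=day(2),        # last timestamp in the seed dataset
    tracking_start=day(6),  # when live tracking began
    already_streamed_ids={"e3"},
)
batches = chunk(missing, size=1)
print([e["id"] for e in missing])  # prints ['e2', 'e4']
```

Comparing ids rather than timestamps alone guards against re-sending events that were streamed with a manually set 'created' value inside the gap.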