mirror of
https://github.com/twitter/the-algorithm.git
synced 2025-01-02 23:51:53 +01:00
Fixes #535
This commit is contained in:
parent
997ee6b63a
commit
b1bc8b2541
@ -15,7 +15,7 @@ SimClusters from the Linear Algebra Perspective discussed the difference between
|
|||||||
However, calculating the cosine similarity between two Tweets is pretty expensive in Tweet candidate generation. In TWISTLY, we scan at most 15,000 (6 source tweets * 25 clusters * 100 tweets per clusters) tweet candidates for every Home Timeline request. The traditional algorithm needs to make API calls to fetch 15,000 tweet SimCluster embeddings. Consider that we need to process over 6,000 RPS, it’s hard to support by the existing infrastructure.
|
However, calculating the cosine similarity between two Tweets is pretty expensive in Tweet candidate generation. In TWISTLY, we scan at most 15,000 (6 source tweets * 25 clusters * 100 tweets per clusters) tweet candidates for every Home Timeline request. The traditional algorithm needs to make API calls to fetch 15,000 tweet SimCluster embeddings. Consider that we need to process over 6,000 RPS, it’s hard to support by the existing infrastructure.
|
||||||
|
|
||||||
|
|
||||||
## SimClusters Approximate Cosine Similariy Core Algorithm
|
## SimClusters Approximate Cosine Similarity Core Algorithm
|
||||||
|
|
||||||
1. Provide a source SimCluster Embedding *SV*, *SV = [(SC1, Score), (SC2, Score), (SC3, Score) …]*
|
1. Provide a source SimCluster Embedding *SV*, *SV = [(SC1, Score), (SC2, Score), (SC3, Score) …]*
|
||||||
|
|
||||||
|
Loading…
Reference in New Issue
Block a user