Noticed the previous SQL results don't include tensorflow. That seemed weird, and it turns out
the old githubarchive:github.timeline table seems to be deprecated.
The new dataset is very large, probably about 2TB.
Ran a simple query:
SELECT * FROM [githubarchive:year.2015] WHERE type="WatchEvent" LIMIT 1
bytes processed: 488 GB
this is better:
SELECT repo_name, COUNT(*) FROM [githubarchive:month.201601] WHERE type="WatchEvent" GROUP BY 1 ORDER BY 2 DESC;
That is only 1 month of data, but it still processes about 1GB. Very expensive.
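To put the two byte counts above in perspective, here is a rough sketch of the arithmetic. BigQuery's on-demand billing is based on the bytes read from the columns a query references, so `LIMIT 1` does not shrink the scan, while naming fewer columns and a smaller table does. The rate `ASSUMED_USD_PER_TB` and the helper `estimate_cost_usd` are hypothetical names and values for illustration only, not quoted pricing; decimal units (1 TB = 1000 GB) are also assumed.

```python
# Rough cost sketch. The per-TB rate below is an assumption, not a quoted price.
GB_PER_TB = 1000  # decimal units, assumed for simplicity


def estimate_cost_usd(gb_processed, usd_per_tb):
    """Return the estimated charge for a query that scans gb_processed GB."""
    return gb_processed / GB_PER_TB * usd_per_tb


ASSUMED_USD_PER_TB = 5.0  # hypothetical rate; check the current pricing page

year_query_gb = 488  # SELECT * on year.2015, as reported in the note
month_query_gb = 1   # two-column query on month.201601, as reported in the note

for label, gb in [("year.2015", year_query_gb), ("month.201601", month_query_gb)]:
    cost = estimate_cost_usd(gb, ASSUMED_USD_PER_TB)
    print(f"{label}: {gb} GB scanned, roughly {cost:.3f} USD at the assumed rate")

# How much more data the first query scanned than the second.
print(f"ratio: {year_query_gb / month_query_gb:.0f}x")
```

Whatever the actual rate, the gap between the two queries is what matters: the same kind of question costs far less when it reads two columns from a monthly table instead of every column from a yearly one.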