+ |
May I ask? |
@vionemc The hosted zip file is automatically updated daily by Alexa. |
Unfortunately the file no longer exists. |
Alexa has stopped offering this file (top-1m.csv.zip). A free alternative is now available from Statvoo: |
Interestingly, the download works again today. I contacted Alexa support and they said that the service is discontinued. I sent a follow-up question asking why the service is available again and for how long. Let's see what happens there. The page that originally referenced the file (https://support.alexa.com/hc/en-us/articles/200461990-Can-I-get-a-list-of-top-sites-from-an-API-) does not link to the file anymore. |
Reply from Alexa: "The file is temporarily available again, yes. We'll post updates concerning the file to our FAQ and Twitter. We do not have any additional updates to share at this time." So we need to keep an eye on, e.g., this: https://twitter.com/Alexa_Support |
The link @ao Does the link |
Is there a smaller CSV anywhere? Maybe top 50,000 or top 100,000 sites? |
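If no smaller file turns up, the full list can be trimmed locally. A minimal sketch, assuming the unzipped CSV has one `rank,domain` entry per line and is already sorted by rank; the `topN` helper and the sample data are made up for illustration:

```javascript
// Keep only the first `n` rows of "rank,domain" CSV text.
// Assumption: one entry per line, already sorted by rank.
function topN(csvText, n) {
  return csvText
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "") // drop blank lines (e.g. trailing newline)
    .slice(0, n)
    .join("\n");
}

// Tiny made-up sample standing in for the real file.
const sampleCsv = "1,example.com\n2,example.org\n3,example.net\n";
console.log(topN(sampleCsv, 2));
```

For the real file you would read it with `fs.readFileSync` and call `topN(text, 50000)` or `topN(text, 100000)`.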
CSV file is working again! Nice! |
OpenDNS has published a new top 1 million list here: http://s3-us-west-1.amazonaws.com/umbrella-static/index.html While the list is not composed in the same way, we hope that it will be useful to some. Read up on the details here: https://blog.opendns.com/2016/12/14/cisco-umbrella-1-million/ |
Can confirm |
Thanks for this! |
@karan1149 |
@xdanx is that file still updated daily or is it from a specific date? |
The zip file has a timestamp of 5/7/2017 today, so it is at least regenerated each day. |
I need to generate a long list of URL-like names (>10 million). Any ideas? |
@doctorhy this is definitely not the place to ask that; go to stackoverflow.com and ask there. Not sure why you wouldn't first google how to generate a random string in a language of your choice. |
Why do people need those domains? |
It is definitely outdated. I have a site in the top million and it is not in the list. I would agree that the file is a few months old, 2-3 perhaps. |
Anybody know of a working solution? I'd prefer a script that I can run on my own to update my database. |
hello |
You might want to try this alternative from Cisco/OpenDNS http://s3-us-west-1.amazonaws.com/umbrella-static/index.html |
It seems that this link Why? Because https://www.alexa.com/siteinfo/cssminifier.com (a site I run) tells me it's ranked 95,425 in the world, but the spreadsheet today puts it closer to 50,000. I remember that the site was up there at some point, but that was a few years ago! So, all in all, I dunno. And as an update for the code above, I created a dir in
Thanks to everyone else for also suggesting other alternatives that are kept up to date, even if they are slightly different. |
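For anyone wanting to repeat that kind of spot check (comparing a site's rank on the Alexa site with its position in the file), here is a minimal sketch. It assumes the unzipped CSV is plain `rank,domain` lines; the `findRank` helper and the sample rows are hypothetical:

```javascript
// Look up a domain's rank in "rank,domain" CSV text.
// Returns the rank as a number, or null if the domain is not listed.
function findRank(csvText, domain) {
  const wanted = domain.toLowerCase();
  for (const line of csvText.split(/\r?\n/)) {
    const comma = line.indexOf(",");
    if (comma === -1) continue; // skip blank or malformed lines
    if (line.slice(comma + 1).trim().toLowerCase() === wanted) {
      return Number(line.slice(0, comma));
    }
  }
  return null;
}

// Made-up rows standing in for the real file.
const rankRows = "1,example.com\n2,example.org\n3,example.net\n";
console.log(findRank(rankRows, "example.org"));  // 2
console.log(findRank(rankRows, "missing.test")); // null
```

A linear scan is fine for a one-off check; for many lookups, building a `Map` from domain to rank once would be the more sensible choice.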
Does anybody know where I can get similar data with metadata such as website category? For example: www.instagram.com => social |
Can anybody help me with the feature extraction process? I don't know how to use Python code for feature extraction. If you do, I'd be happy to get some help. Thanks in advance. |
This is JavaScript in nodejs, not Python. Perhaps you can google for an example in Python instead. |
Hi there,
➜ curl -I http://s3.amazonaws.com/alexa-static/top-1m.csv.zip
HTTP/1.1 200 OK
x-amz-id-2: 0QFTzV4zLKRksLmC4JWG/iE/qVKQlSsr7m+lZbnRlrxocqsYbqgHMnjxlBuMTfWQhwrt7/NsULA=
x-amz-request-id: DA7628383A36A79D
Date: Sat, 22 Jun 2019 10:54:43 GMT
Last-Modified: Sat, 22 Jun 2019 10:38:33 GMT # Does this mean it's up to date?
ETag: "5a4fdd26b49d1e579335dde414012297"
x-amz-meta-alexa-last-modified: 20190622103832
Accept-Ranges: bytes
Content-Type: application/zip
Content-Length: 96 |
No, it could just be a cached response header. |
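To put a number on the headers quoted above, here is a rough sketch that compares the `Date` header with the `x-amz-meta-alexa-last-modified` value. It assumes that value is a `YYYYMMDDHHMMSS` timestamp in UTC, which is a guess based on how closely it matches `Last-Modified`; both helper names are made up. As the reply says, these headers only tell you when the object was written, not how fresh the rankings inside it are.

```javascript
// Parse a compact timestamp such as "20190622103832".
// Assumption: the format is YYYYMMDDHHMMSS in UTC.
function parseCompactStamp(stamp) {
  const m = /^(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})$/.exec(stamp);
  if (!m) return null;
  const [y, mo, d, h, mi, s] = m.slice(1).map(Number);
  return new Date(Date.UTC(y, mo - 1, d, h, mi, s)); // months are 0-based
}

// Hours between the server's Date header and the compact stamp.
function ageInHours(dateHeader, compactStamp) {
  const generated = parseCompactStamp(compactStamp);
  if (generated === null) return null;
  return (new Date(dateHeader).getTime() - generated.getTime()) / 3600000;
}

// Values copied from the curl output above.
const fileAge = ageInHours("Sat, 22 Jun 2019 10:54:43 GMT", "20190622103832");
console.log(fileAge.toFixed(2)); // 0.27
```

With the values from that response, the object was written about 16 minutes before the request.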
http://s3.amazonaws.com/alexa-static/top-1m.csv.zip |
Ah thanks @garrett-leyenaar, that's good to know it's still being updated. |
Interesting that today (2019-10-22) I re-ran my steps from https://gist.github.com/chilts/7229605#gistcomment-2880207 and noticed that I only get entries 1 to 647605 printed out. So I downloaded the
|
www.ktservis.com.tr used to mirror this file, but I think they removed it because of copyright issues. Any other mirrors? |
No need for a mirror, the file is still available using the URL from the script: http://s3.amazonaws.com/alexa-static/top-1m.csv.zip |
thanks
|
The Alexa zip file contains only 839000 entries. 1m != 839000 |
As of today, the Alexa "one million" contains 547855 entries. Very strange. |
Today it is 763k. Last summer it started falling short of "one million". I am here again trying to figure out why. We used Alexa in the past, and I still can't find anything on why it is so short of 1 million. Good paper on T1M rankings pdf |
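The counts reported above are easy to reproduce once the file is unzipped. A minimal sketch, assuming plain `rank,domain` lines; the `summarize` helper and the three-row sample are made up, and the one-million target is passed in rather than hard-coded:

```javascript
// Summarize "rank,domain" CSV text: row count, highest rank seen,
// and how far the count falls short of the expected size.
function summarize(csvText, expected) {
  let count = 0;
  let maxRank = 0;
  for (const line of csvText.split(/\r?\n/)) {
    const comma = line.indexOf(",");
    if (comma === -1) continue; // skip blank or malformed lines
    const rank = Number(line.slice(0, comma));
    if (!Number.isInteger(rank) || rank < 1) continue; // skip non-numeric ranks
    count += 1;
    if (rank > maxRank) maxRank = rank;
  }
  return { count, maxRank, missing: expected - count };
}

// Made-up sample: three rows where five were expected.
const shortList = "1,example.com\n2,example.org\n3,example.net\n";
console.log(summarize(shortList, 5)); // { count: 3, maxRank: 3, missing: 2 }
```

Comparing `count` with `maxRank` also shows whether the file is merely truncated (both equal) or has gaps in the ranking.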
Alexa no longer provides that list for free. The price is: So for 1 million domains you'd pay 0.0025 * 1000000 = $2500 |
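The same arithmetic as a small sketch, in case you want to price a smaller list. The 0.0025 per-domain figure is taken from the comment above and not verified against any current price list; `costUsd` is a made-up helper name:

```javascript
// Cost of a list of `domains` entries at a flat per-domain price,
// rounded to whole cents to avoid floating-point noise.
function costUsd(domains, pricePerDomain) {
  return Math.round(domains * pricePerDomain * 100) / 100;
}

console.log(costUsd(1000000, 0.0025)); // 2500
console.log(costUsd(50000, 0.0025));   // 125
```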
|
The Alexa file still works in 2021. |
Plop this into a terminal to get the aforementioned nodejs script working on Ubuntu:
chilts: i'm diggin' it. thank you for sharing this.