
Proprietary AI products, plus product development and R&D services for non-standard, science-intensive tasks.


VK.com Data and Tools

Anonymized VKontakte dataset and data collection tools






When we were developing the first version of the MaximaTelecom STATMAxima big data platform, along with a system for profiling users of Wi-Fi in the Moscow Metro, we had no data to train on and none to infer user traits from. All we had were the raw BIND DNS logs.

We had to build our own training corpus to be able to predict a user's gender, age, interests, income level and so on. We also had to infer places of interest, that is, the locations where a person works, lives, and wines and dines, but that's another story.

We used publicly available VK.com user profiles as well as some other sources to build this training corpus.

On this page we are sharing datasets of public VKontakte user profiles as well as some tools we wrote to acquire the data.


Update 06 November 2017

On request, we provide access to the data collected in 2014. The database is in SQLite format and contains 208,130,605 records. It takes up 206.6 GB on disk, or 24.2 GB compressed.
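Since the database ships as a single SQLite file, a first look at it can be taken with Python's standard `sqlite3` module. The snippet below is only a minimal sketch: the table layout is not described on this page, so it assumes nothing about table or column names and simply reads them from SQLite's own `sqlite_master` catalogue. The function name `summarize_sqlite` is ours and purely illustrative.

```python
import sqlite3
from pathlib import Path


def summarize_sqlite(path):
    """Return {table_name: row_count} for every user table in a SQLite file.

    No schema is assumed: table names are read from the sqlite_master
    catalogue that every SQLite database contains.
    """
    # Open read-only so that a mistyped path raises an error instead of
    # silently creating a new, empty database file.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
            # Skip SQLite's internal bookkeeping tables.
            if not row[0].startswith("sqlite_")
        ]
        counts = {}
        for name in names:
            # Quote the identifier, since table names cannot be bound as parameters.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
    finally:
        conn.close()
```

Calling `summarize_sqlite("sample.db")` (a hypothetical file name) returns a dictionary of table names and row counts. On the full database a `COUNT(*)` has to scan a very large table and may take a long time, so it is worth trying this on a small file first.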

Download the data sample (1,000 records). The collected data was public, but we removed names, photos and telephone numbers to prevent misuse.

To receive the database (it's free), send us a request. Please describe the project you are working on. If your request comes from an academic email address, you will receive the database with all fields present, as in the original public data.



We are preparing more material for download (a 2018 database of public VK profiles, and tools). In the meantime, please send us a message if you would like to be notified when it is ready, or subscribe to one of our accounts:

Contact Us
info@ocutri.com

