Wednesday, December 12, 2012

Crawling information about Android Apps through Android Marketplace Crawler

     
Note: On my machine, the crawler described below was able to crawl only around 50 apps. If you know a little programming, you can use this link to download multiple APKs from Google Play instead.

1) Open this link. Click on 'create a clone', fill out the information, and you will see a link to your new clone, like mine.

2) Click the 'Download zip' button on your clone's web page (this is my clone's code) and you will get all the updated source code for Marketplace. I recommend downloading from your own clone, since it will have the up-to-date code and my clone might be unstable because of my changes.

3) Unzip this source code and you will have two folders: crawler and server.

4) As per the requirements listed on the Android-Marketplace-Crawler page, we need to install 'ant'.
Here are the installation instructions for 'ant'.

5) Java:

Since you want to work on the crawler, I assume you have already installed the Android SDK and Java on your machine. I am not sure whether the crawler project really needs the Android SDK, but I already had it on my PC before working on this project.

6) As stated on the crawler home page, you need a valid Google account to run the crawler. Open the file hg/crawler/src/com/marketplace/io/Secure.java and modify the constructor, the getters and the getUsers() method.

I only changed the credentials in the following method on my machine.

public Secure() {
   preferences.put("username_key", "<yourEmail>@gmail.com");
   preferences.put("password_key", "<Password>");
}
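If you plan to share your clone, a password typed into Secure.java is easy to leak. Here is a minimal sketch of one way around that, assuming 'preferences' behaves like a simple string-to-string map (the crawler's actual type is not shown in this post). The class name CredentialLoader and the environment variable names CRAWLER_USERNAME and CRAWLER_PASSWORD are hypothetical and are not part of the crawler.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Android-Marketplace-Crawler.
final class CredentialLoader {

    // Builds the two entries that Secure() stores, reading them from the
    // given environment map instead of string literals in the source.
    static Map<String, String> load(Map<String, String> env) {
        String user = env.get("CRAWLER_USERNAME");     // assumed variable name
        String password = env.get("CRAWLER_PASSWORD"); // assumed variable name
        if (user == null || user.isEmpty() || password == null || password.isEmpty()) {
            throw new IllegalStateException(
                "Set CRAWLER_USERNAME and CRAWLER_PASSWORD before running the crawler");
        }
        Map<String, String> preferences = new HashMap<String, String>();
        preferences.put("username_key", user);
        preferences.put("password_key", password);
        return preferences;
    }

    public static void main(String[] args) {
        Map<String, String> preferences = load(System.getenv());
        System.out.println("Loaded credentials for " + preferences.get("username_key"));
    }
}
```

Inside Secure() you could then copy the two returned entries into 'preferences' instead of typing the values in.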



7) It took me a few hours to figure out how to build the crawler project with 'ant' and what I needed to change in the project before building.

On my computer, the path to the 'crawler' folder is 'E:\tempCrawler\crawler'. Inside this folder there is a file called 'build.xml'. Open this file and find this tag:

<target name="dist" depends="compile" description="generate the distribution"> 

Under this tag, you will find this tag:

<fileset dir="/Users/raunak/Development/android-marketplace-crawler/crawler/bin"/>

Change this tag as shown below, using your own folder path. My folder path is 'E:\tempCrawler\crawler'; note that the path is written with forward slashes and without the drive letter.

<fileset dir="/tempCrawler/crawler"/>
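If you would rather not edit build.xml by hand, the same change can be sketched in Java. This is only an illustration, assuming the original fileset line appears exactly as shown above. The class BuildXmlPatcher and its method names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the crawler sources.
final class BuildXmlPatcher {

    // Turns a Windows path such as E:\tempCrawler\crawler into the form
    // used in this post: forward slashes and no drive letter.
    static String toForwardSlashPath(String windowsPath) {
        String path = windowsPath.replace('\\', '/');
        if (path.length() >= 2 && Character.isLetter(path.charAt(0)) && path.charAt(1) == ':') {
            path = path.substring(2);
        }
        return path;
    }

    // Replaces the dir value of every <fileset> tag that points at oldDir.
    static String replaceFilesetDir(String buildXml, String oldDir, String newDir) {
        String from = "<fileset dir=\"" + oldDir + "\"";
        String to = "<fileset dir=\"" + newDir + "\"";
        return buildXml.replace(from, to);
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get("build.xml"); // run from the 'crawler' folder
        String xml = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        String patched = replaceFilesetDir(
            xml,
            "/Users/raunak/Development/android-marketplace-crawler/crawler/bin",
            toForwardSlashPath("E:\\tempCrawler\\crawler"));
        Files.write(file, patched.getBytes(StandardCharsets.UTF_8));
    }
}
```

Keep a backup copy of build.xml before running something like this.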

8) Now open a command window by typing 'cmd' in the 'Start' search bar. Change the current directory to 'E:\tempCrawler\crawler' so that the prompt looks like:

E:\tempCrawler\crawler>

9) Type 'ant all' (without quotes) in the command window. It will create a few folders as specified in the build.xml file. The important one is 'E:\tempCrawler\crawler\dist\lib', where ant puts the 'crawler.jar' file.

If you can find this 'crawler.jar' file, you have successfully built the project.

10) Now copy the 'permission' file from the 'crawler' folder to 'E:\tempCrawler\crawler\dist\lib'.

The 'lib' folder should now contain both the 'crawler.jar' and 'permission' files.
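A quick way to confirm that both files are in place is a small check like the one below. It is a sketch only: the class name BuildOutputCheck is hypothetical, and the default folder is the path from my machine, so pass your own path as the first argument.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the crawler sources.
final class BuildOutputCheck {

    // Returns the names of the expected files that are absent from libDir.
    static List<String> missingFiles(File libDir) {
        String[] expected = {"crawler.jar", "permission"};
        List<String> missing = new ArrayList<String>();
        for (String name : expected) {
            if (!new File(libDir, name).isFile()) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        File libDir = new File(args.length > 0 ? args[0] : "E:\\tempCrawler\\crawler\\dist\\lib");
        List<String> missing = missingFiles(libDir);
        System.out.println(missing.isEmpty() ? "lib folder is ready" : "Missing: " + missing);
    }
}
```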

11) Open the 'permission' file with your favorite text editor (I use Notepad++) and delete the empty lines at the end. Make sure there is no empty line at the end of the file.

If you don't do this, the crawler will throw an exception and you will not be able to crawl apps from Android Market.
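The same cleanup can be done with a few lines of Java instead of an editor. This is a minimal sketch with a hypothetical class name, PermissionFileCleaner. It removes all whitespace at the end of the file, which goes slightly further than deleting only the empty lines because the final line break is removed as well.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the crawler sources.
final class PermissionFileCleaner {

    // Removes all trailing whitespace, including blank lines and the
    // final line break. Blank lines inside the text are left alone.
    static String stripTrailingBlankLines(String text) {
        int end = text.length();
        while (end > 0 && Character.isWhitespace(text.charAt(end - 1))) {
            end--;
        }
        return text.substring(0, end);
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args.length > 0 ? args[0] : "permission");
        // ISO-8859-1 maps each byte to one char, so the kept part of the
        // file is written back byte for byte.
        String text = new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1);
        Files.write(file, stripTrailingBlankLines(text).getBytes(StandardCharsets.ISO_8859_1));
    }
}
```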

12) Here comes the third installation requirement: the evil 'Ruby on Rails 3'. It took me 30 hours to figure out the correct versions of Ruby and Rails for a successful installation, and this is the real reason for writing this manual.

Download Ruby 1.9.2 from http://rubyinstaller.org/downloads/ and install it.

On my computer, it is at "C:\Ruby192". You can install it anywhere on C:\, but remember that no folder name in the path may contain a space.

Type 'ruby -v' in the command window and you will get output like:

ruby 1.9.2p290 (2011-07-09) [i386-mingw32]

13) Download DevKit from 'https://github.com/oneclick/rubyinstaller/downloads/'.

If you click on 'Download DevKit', you will be able to download the installer. However, if that link is broken, download this version: 'DevKit-tdm-32-4.5.2-20110712-1620-sfx.exe'.

During installation, it will ask where to extract. I extracted it to 'C:\Ruby192\Devkit'. It may work in other locations too, but I didn't check.

Post-installation instructions are given at this link. Follow them carefully.

14) After this, I am not sure of the exact steps I followed. I tried several installations suggested on the internet, desperately wanting it to work. However, I will try my best to mention everything required for a successful installation.

I didn't have git installed on my computer, so I removed a few git-related things from some files. If you already have git installed, I am not sure whether the project would work as it is.

Now go to the 'server' folder at 'E:\tempCrawler\'. Open the 'Gemfile.lock' file and remove the first few lines so that the first two lines look like:

GEM
  remote: http://rubygems.org/
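For reference, here is a sketch of that trimming step in Java. It assumes the lines to remove are everything before the first line that reads exactly GEM; I did not record what those lines contained. The class name GemfileLockTrimmer is hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the crawler sources.
final class GemfileLockTrimmer {

    // Returns the text starting at the first line that is exactly "GEM".
    // If there is no such line, the input is returned unchanged.
    static String dropLinesBeforeGem(String lockFile) {
        int pos = 0;
        while (pos < lockFile.length()) {
            int eol = lockFile.indexOf('\n', pos);
            int lineEnd = (eol < 0) ? lockFile.length() : eol;
            String line = lockFile.substring(pos, lineEnd);
            if (line.equals("GEM") || line.equals("GEM\r")) {
                return lockFile.substring(pos);
            }
            if (eol < 0) {
                break;
            }
            pos = eol + 1;
        }
        return lockFile;
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get("Gemfile.lock"); // run from the 'server' folder
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        Files.write(file, dropLinesBeforeGem(text).getBytes(StandardCharsets.UTF_8));
    }
}
```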

15) Open the file 'Gemfile' in the server folder and find this line:

gem 'vestal_versions' , :git => 'git://github.com/adamcooper/vestal_versions'

Comment out the git part so that it looks like:

gem 'vestal_versions' #, :git => 'git://github.com/adamcooper/vestal_versions'


16) Add one line to the 'Gemfile' file:

gem "rake", "0.8.7"
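Steps 15 and 16 can also be applied together by a small program. The sketch below uses a hypothetical class name, GemfilePatcher, and assumes the vestal_versions line is on a single line, as shown above. Running it twice leaves the file unchanged the second time.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the crawler sources.
final class GemfilePatcher {

    // Comments out the :git option on the vestal_versions line and
    // appends the rake 0.8.7 line if no rake line is present yet.
    static String patch(String gemfile) {
        StringBuilder out = new StringBuilder();
        boolean hasRake = false;
        for (String line : gemfile.split("\r?\n")) {
            String trimmed = line.trim();
            int git = line.indexOf(", :git");
            int hash = line.indexOf('#');
            boolean alreadyCommented = hash >= 0 && hash < git;
            if (trimmed.startsWith("gem 'vestal_versions'") && git >= 0 && !alreadyCommented) {
                line = line.substring(0, git) + "#" + line.substring(git);
            }
            if (trimmed.startsWith("gem \"rake\"") || trimmed.startsWith("gem 'rake'")) {
                hasRake = true;
            }
            out.append(line).append('\n');
        }
        if (!hasRake) {
            out.append("gem \"rake\", \"0.8.7\"\n");
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get("Gemfile"); // run from the 'server' folder
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        Files.write(file, patch(text).getBytes(StandardCharsets.UTF_8));
    }
}
```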

17) In the command window, type 'gem install rails -v=3.0.3' as shown below. Make sure you are in the 'server' folder at the command prompt.


E:\tempCrawler\server>gem install rails -v=3.0.3

I got only 1 gem installed because I had already installed all the gems; you may get more than one.

Type 'rails --version' and it should show 'Rails 3.0.3'.


18) I forgot to mention the MySQL server installation. Install mysql-5.5.28-winx64.msi.

I assume you didn't set a password for 'root'; if you did, note it down.

19) Now type 'bundle install' in the command window. Make sure you are still in the server folder.

E:\tempCrawler\server>bundle install

After successfully completing the steps above, the command should finish without errors.


20) Now type 'rake db:migrate' in the command window. Make sure you are still in the server folder.

Since I had already created the databases listed in the file 'E:\tempCrawler\server\config\database.yml' with this command, I could not run it again successfully for this guide.

Note that if you set a password for 'root', you should put it in the database.yml file; otherwise leave the file as it is.

I assume it worked on your machine; if not, StackOverflow is the best place to ask questions.

21) Start the server by typing 'rails server' in the server folder if it is not already running. Then open 'http://localhost:3000/' in your browser; it should at least not show any error. It should show either a page about Ruby or something like the listing below. If you see the latter, congratulations.

Listing apps

Title Creator Category Price
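If you prefer to check the server from the command line rather than the browser, something like the sketch below works. It assumes the Rails server from the previous steps is listening on port 3000. The class name ServerCheck is hypothetical, and treating any status below 400 as success is my own choice.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, not part of the crawler sources.
final class ServerCheck {

    // Treats success and redirect codes (200-399) as "no error".
    static boolean isOk(int statusCode) {
        return statusCode >= 200 && statusCode < 400;
    }

    // Sends a GET request and returns the HTTP status code.
    static int fetchStatus(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        String address = "http://localhost:3000/";
        try {
            int status = fetchStatus(address);
            System.out.println(address + " returned " + status + (isOk(status) ? " (OK)" : " (error)"));
        } catch (IOException e) {
            System.out.println("Could not reach " + address + ": " + e.getMessage());
        }
    }
}
```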


22) If you want to crawl apps by category, keep your server running. Open a new command window by typing 'cmd'.

Remember that we built 'crawler.jar' in the 'E:\tempCrawler\crawler\dist\lib' folder earlier. In the command window, go to the 'lib' folder and type:

java -jar crawler.jar -c

You can press Ctrl+C after some time if you don't want to crawl all the apps. Reload the same URL in the browser and you will see the apps' information on the page.

The Android-Marketplace-Crawler page is the best place to learn what you can do with this installation.
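Instead of watching the window and stopping the crawler by hand, you could launch it with a time limit. The sketch below is only an illustration: the class name TimedCrawl and the 10-minute limit are arbitrary choices of mine, the folder is the one from my machine, and stopping the process this way is only roughly what Ctrl+C does.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the crawler sources.
final class TimedCrawl {

    // The same command that is typed by hand in this step.
    static List<String> command() {
        return Arrays.asList("java", "-jar", "crawler.jar", "-c");
    }

    public static void main(String[] args) throws Exception {
        File libDir = new File("E:\\tempCrawler\\crawler\\dist\\lib");
        long minutes = 10; // arbitrary limit, adjust as needed

        Process crawler = new ProcessBuilder(command())
            .directory(libDir) // run from the folder that holds crawler.jar
            .inheritIO()       // show the crawler's output in this window
            .start();

        // Stop the crawler if it is still running when the limit is reached.
        if (!crawler.waitFor(minutes, TimeUnit.MINUTES)) {
            crawler.destroy();
        }
    }
}
```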

Good Luck =).

If you like my post and want to say thanks, you can get me a coffee. =)

15 comments:

  1. Hi.

    Thank you for your post!

    I have a problem.
    When I run "java -jar crawler.jar -c", the crawler gets only some apps (42 apps). Are there other options I need to change?

    Thanks.

  2. Hmm, I had the same problem on my machine when I used this crawler. I thought maybe there was a problem with my configuration. After that, I moved on to the GooglePlay-Crawler API. If you want to download all APKs or crawl information from Google Play, I recommend following this post (http://mohsin-junaid.blogspot.com/2012/12/how-to-download-multiple-android-apks.html).
