Sunday, March 1, 2015

Android App development on a Mac

I started with Android development yesterday, and getting the app to run on my phone took some struggle. I hope this post helps novice Android developers get started with building their first app.
OS X version: Yosemite 10.10.1

To develop an Android app, you need the following installed.

Install Java


First, check whether Java is installed. In your terminal, type:
$ java -version
If you have Java 6 or above, you can skip to setting JAVA_HOME; otherwise, download the Java SDK (JDK) from the Java install page.

Now set JAVA_HOME in your ~/.bash_profile as follows:
export JAVA_HOME=$(/usr/libexec/java_home)
export JDK_HOME=$(/usr/libexec/java_home)

On the terminal window, type the following command:
$ source ~/.bash_profile
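
A quick way to confirm that the setting took effect is to check that the directory in JAVA_HOME really contains a java binary. The helper below is only a sketch; the function name check_java_home is my own and is not part of the JDK.

```shell
# Hypothetical helper (not part of the JDK): succeeds when the given
# directory looks like a Java home, i.e. it contains an executable bin/java.
check_java_home() {
  dir="$1"
  if [ -z "$dir" ]; then
    echo "JAVA_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$dir/bin/java" ]; then
    echo "no executable found at $dir/bin/java" >&2
    return 1
  fi
  echo "JAVA_HOME looks valid: $dir"
}

# Usage, after sourcing ~/.bash_profile:
#   check_java_home "$JAVA_HOME"
```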

Install Android Studio


Get started by downloading Android Studio from the Android Developers site.
After setting up Android Studio as per the instructions given on the site, you can start developing your first Android project.

Setting up USB debugging on your Android device


Plug your device into your Mac and then enable USB debugging on your phone. This setting is found under Developer options on your Android phone. Note that Developer options is hidden by default on devices running Android 4.2 and above.

To find the hidden menu, open the Settings app and scroll down to About phone.

Then go down to Build number and tap on it 7 times.

When you go back to Settings, you will see the Developer options menu.

Enable USB debugging on the device.

Getting your Mac to detect your Android device


Ideally, your Mac should detect your phone by default. If it does not, follow these steps.

1. Make sure that adb (Android Debug Bridge) is included in your path. The default location for adb is:
~/Library/Android/sdk/platform-tools/

2. Add the following line to your ~/.bash_profile:
export PATH=/usr/local/sbin:$PATH:~/Library/Android/sdk/platform-tools/

3. On the terminal window, type the following command:
$ source ~/.bash_profile

4. Next, add the Vendor ID to ~/.android/adb_usb.ini. First you have to find the Vendor ID value, which on a Mac is easy to do from the System Information application.
Launch System Information. Under Hardware in the left pane, select USB. In the right pane you will see the list of USB devices; select your phone. The lower pane on the right shows the Vendor ID. Copy this Vendor ID into the file ~/.android/adb_usb.ini.

5. Restart adb
$ adb kill-server
$ adb devices

The device should be listed.
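
Step 4 can also be done from the terminal. The sketch below appends a Vendor ID to the file only if it is not already there. The function name add_vendor_id and the example ID 0x04e8 are my own; the helper assumes the file holds one hexadecimal ID per line, so check that against your own file before relying on it.

```shell
# Hypothetical helper: append a USB vendor ID to an adb_usb.ini-style file,
# one ID per line, skipping IDs that are already present.
add_vendor_id() {
  vendor_id="$1"
  ini_file="${2:-$HOME/.android/adb_usb.ini}"

  # Accept only short hexadecimal IDs such as 0x04e8 (assumed format).
  if ! printf '%s\n' "$vendor_id" | grep -Eq '^0x[0-9a-fA-F]{1,4}$'; then
    echo "expected a hex vendor ID like 0x04e8, got: $vendor_id" >&2
    return 1
  fi

  mkdir -p "$(dirname "$ini_file")"
  touch "$ini_file"

  # -F fixed string, -x whole line, -i ignore case, -q no output
  if ! grep -qixF "$vendor_id" "$ini_file"; then
    printf '%s\n' "$vendor_id" >> "$ini_file"
  fi
}

# Usage (the ID is an example; use the one shown for your phone):
#   add_vendor_id 0x04e8
```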

That's it, you are all set for developing Android apps and testing them on your phone.

Thursday, January 1, 2015

Gearing up for Learning Scala

Scala, whose name is a blend of "scalable" and "language", is described by Wikipedia as an object-functional programming language. This post is an introduction to setting up your environment for Scala development and writing a Hello World program in Scala. We will dive deeper into the concepts of Scala in later posts.

Installation


1. You should have JDK 6 or JDK 7 installed.
2. To install the Scala build tool (sbt) on OS X, you can use Homebrew:
$ brew update
$ brew install sbt
3. Download the Scala IDE for Eclipse. After downloading the archive for your operating system, unpack it and start Eclipse.

Building your Scala Hello World program


1. Go to File --> New --> Scala Project.

2. Choose a Project name and Select Finish.

3. Select File --> New --> Scala Object to create a new object.

4. Enter Hello as the name of the object and greeter as the package name.

5. Change the source code as below:
package greeter

object Hello extends App{
  println("Hello, World!")
}

6. Save the file and select Run -> Run from the menu. Choose to run it as a Scala Application.

7. You can see the output, Hello, World!, in the console.
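
Since sbt is installed as part of the setup, the same program can probably be run without Eclipse as well. The following is a rough sketch that lays out a minimal sbt project from the terminal. The function name, the project name "hello" and the Scala version in build.sbt are my own assumptions; adjust them to match your installation.

```shell
# Hypothetical helper: lay out a minimal sbt project for the Hello object.
# The project name and Scala version below are assumptions.
create_hello_project() {
  project_dir="$1"
  mkdir -p "$project_dir/src/main/scala/greeter"

  cat > "$project_dir/build.sbt" <<'EOF'
name := "hello"

scalaVersion := "2.11.4"
EOF

  cat > "$project_dir/src/main/scala/greeter/Hello.scala" <<'EOF'
package greeter

object Hello extends App {
  println("Hello, World!")
}
EOF
}

# Usage (requires sbt on the PATH):
#   create_hello_project hello && cd hello && sbt run
```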



Tuesday, December 23, 2014

Configuring Hive on Ubuntu


Hive facilitates querying and managing large datasets residing in distributed storage. It is built on top of Hadoop. Hive defines a simple query language called Hive Query Language (HQL), which enables users familiar with SQL to query the data. Hive converts your HQL queries into a series of MapReduce jobs for execution on a Hadoop cluster. In this post we will configure Hive on our machine.

Download Hive from the Apache Hive site. Unpack the .tar to the location of your choice and assign ownership to the user setting up Hive. At the time of this writing, the latest version available is 0.14.0.

Prerequisites:
Java: 1.6 or higher. Preferred version would be 1.7
Hadoop: 2.x. For Hadoop installation you can refer to this post.

Installation

Set the environment variable HIVE_HOME to point to the installation directory. You can set this in your .bashrc:
export HIVE_HOME=/user/hive

Finally, add $HIVE_HOME/bin to your PATH:
$ export PATH=$HIVE_HOME/bin:$PATH

Setting HADOOP_HOME in hive-config.sh
Append the following line to the file $HIVE_HOME/bin/hive-config.sh:
export HADOOP_HOME=/user/hadoop


Running Hive
You must create /tmp and /user/hive/warehouse in HDFS and make them group-writable before you can create any table in Hive.
$ hadoop fs -mkdir -p /user/hive/warehouse
$ hadoop fs -chmod g+w /user/hive/warehouse
$ hadoop fs -mkdir -p /tmp
$ hadoop fs -chmod g+w /tmp

Start the hive shell
$ hive

The shell would look something like
Logging initialized using configuration in jar:file:/user/hive/lib/hive-common-0.14.0.jar!/hive-log4j.properties
hive >
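
To check the setup without typing into the interactive shell, a few statements can be passed to hive with the -e option. This is only a sketch: the function name and the table name smoke_test are made up for the example, and it assumes HDFS is running and the warehouse directory above exists.

```shell
# Hypothetical smoke test: runs a few HiveQL statements non-interactively.
# The table name smoke_test is made up for this example.
hive_smoke_test() {
  if ! command -v hive >/dev/null 2>&1; then
    echo "hive not found on PATH; check that \$HIVE_HOME/bin was added" >&2
    return 1
  fi
  hive -e '
    CREATE TABLE IF NOT EXISTS smoke_test (id INT, name STRING);
    SHOW TABLES;
    DROP TABLE smoke_test;
  '
}

# Usage:
#   hive_smoke_test
```

If the table shows up in the SHOW TABLES output, Hive can talk to both its metastore and HDFS.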

Reference : https://cwiki.apache.org/confluence/display/Hive/Home

Tuesday, December 16, 2014

Configuring Hadoop on Ubuntu in pseudo-distributed mode


Hadoop is an open-source Apache project that enables processing of extremely large datasets in a distributed computing environment. There are three different modes in which it can be run:

1. Standalone Mode
2. Pseudo-Distributed Mode
3. Fully-Distributed Mode

This post covers setting up Hadoop 2.5.2 in pseudo-distributed mode on an Ubuntu machine. For setting up Hadoop on OS X, refer to this post.

Prerequisites


Java: Install Java if it isn’t installed on your system.
Keyless SSH: First, ensure ssh is installed. Then generate the key pair.
$ sudo apt-get install ssh
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Now ssh into localhost and accept the host key when prompted.
rsync utility:
$ sudo apt-get install rsync

Installation


Download Hadoop from the Apache Hadoop site. Unpack the .tar to the location of your choice and assign ownership to the user setting up Hadoop. At the time of this writing, the latest version available is 2.5.2.

Configuration


Every component of Hadoop is configured using an XML file located in hadoop-2.5.2/etc/hadoop. MapReduce properties go in mapred-site.xml, HDFS properties in hdfs-site.xml, and common properties in core-site.xml. The general Hadoop environment properties are found in hadoop-env.sh.

hadoop-env.sh
# set to the root of your Java installation
export JAVA_HOME=/usr

# Assuming your installation directory is /user/hadoop
export HADOOP_PREFIX=/user/hadoop
For the rest of this post, we refer to /user/hadoop when we say $HADOOP_HOME.

core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

hdfs-site.xml

The Hadoop Distributed File System properties go in this config file. Since we are only setting up one node, we set the value of dfs.replication to 1.
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
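
If you would rather not edit the XML by hand, the two files above can be written from the shell with here-documents. This is a sketch under my own naming (write_pseudo_distributed_conf is not a Hadoop command), and it overwrites any existing core-site.xml and hdfs-site.xml in the target directory, so back those up first.

```shell
# Hypothetical helper: write the two single-node config files shown above
# into the given directory (for example $HADOOP_HOME/etc/hadoop).
# Existing files with the same names are overwritten.
write_pseudo_distributed_conf() {
  conf_dir="$1"
  mkdir -p "$conf_dir"

  cat > "$conf_dir/core-site.xml" <<'EOF'
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
EOF

  cat > "$conf_dir/hdfs-site.xml" <<'EOF'
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
EOF
}

# Usage:
#   write_pseudo_distributed_conf "$HADOOP_HOME/etc/hadoop"
```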


Execution


Before starting the daemons we must format the newly installed HDFS.
$ cd $HADOOP_HOME
$ bin/hdfs namenode -format

Start the Daemons:
$ cd $HADOOP_HOME
$ sbin/start-dfs.sh

Monitoring
By default, the web interface for NameNode is available at http://localhost:50070

Check the output of jps
$ jps
10582 SecondaryNameNode
10260 NameNode
10685 Jps
10404 DataNode
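
Rather than reading the jps listing by eye, the check can be scripted. The sketch below reads jps output on standard input and reports which of the three HDFS daemons is missing; the function name check_hdfs_daemons is my own.

```shell
# Hypothetical helper: read `jps` output on stdin and report any of the
# three HDFS daemons that are missing. Returns non-zero if one is missing.
check_hdfs_daemons() {
  jps_output=$(cat)
  missing=0
  for daemon in NameNode DataNode SecondaryNameNode; do
    # Compare the second column exactly, so that "NameNode" is not
    # satisfied by a "SecondaryNameNode" line.
    if ! printf '%s\n' "$jps_output" |
         awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "missing: $daemon"
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   jps | check_hdfs_daemons && echo "all HDFS daemons are running"
```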

Running Examples
1. Create the HDFS directories required to execute MapReduce jobs:
$ cd $HADOOP_HOME
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>

2. Copy the input files to the Hadoop Distributed File System
$ bin/hdfs dfs -put etc/hadoop input

3. Run the example provided
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.2.jar grep input output 'dfs[a-z.]+'

4. View the output files on HDFS
$ bin/hdfs dfs -cat output/*
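
To get a feel for what the grep example computes, here is a rough local equivalent that runs on ordinary files instead of HDFS. It prints every string matching the regular expression together with its number of occurrences, most frequent first. The function name count_matches is my own, and the output layout differs slightly from what the MapReduce job writes.

```shell
# Local approximation of the example job: print each string matching the
# regular expression together with how often it occurs, most frequent first.
count_matches() {
  pattern="$1"
  shift
  # -o print only the matched text, -h omit file names, -E extended regex
  grep -ohE "$pattern" "$@" | sort | uniq -c | sort -rn
}

# Usage:
#   count_matches 'dfs[a-z.]+' etc/hadoop/*.xml
```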

Stop the Daemons:
$ cd $HADOOP_HOME
$ sbin/stop-dfs.sh

Reference : http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation

Friday, December 12, 2014

Git Basics - A cheat sheet for your daily git needs.

This post is for anyone to refer to for their daily git needs. We will not be covering any advanced git concepts here.


Git is a distributed version control system.
Some basic terminology:
Directory: A folder that contains multiple files.
Repository: A directory where Git has been initialized to start version controlling your files.

I have created an empty directory called gitBasics on my machine.
$ ls -a
.  ..

Let us initialize an empty git repository.
$ git init
Initialized empty Git repository in /Users/anjana/gitBasics/.git/
$ ls -a
.    ..   .git
As seen above, a hidden .git directory is created inside gitBasics, indicating that a repository has been initialized.

Next, let's see the current status of the directory as compared to the repository.
$ git status
On branch master

Initial commit

nothing to commit (create/copy files and use "git add" to track)

Now let's create a file filename.txt in the directory.
$ ls -a
.            ..           .git         filename.txt

Let's check the status again.
$ git status
On branch master

Initial commit

Untracked files:
  (use "git add <file>..." to include in what will be committed)

 filename.txt

nothing added to commit but untracked files present (use "git add" to track)
git shows that an untracked file is present.

Let's add this file to the staging area.
$ git add filename.txt
$ git status
On branch master

Initial commit

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

 new file:   filename.txt

Next, let's commit these changes.
$ git commit -m "Adding test file"
[master (root-commit) 4b8b52d] Adding test file
 Committer: Shankar 
Your name and email address were configured automatically based
on your username and hostname. Please check that they are accurate.
You can suppress this message by setting them explicitly:

    git config --global user.name "Your Name"
    git config --global user.email you@example.com

After doing this, you may fix the identity used for this commit with:

    git commit --amend --reset-author

 1 file changed, 1 insertion(+)
 create mode 100644 filename.txt

At commit time, git tries to identify the author of the commit. To set the identity explicitly, use the following commands.
$ git config --global user.name "Anjana Shankar"
$ git config --global user.email "***@g***.com"

Now when you run git status, it says that the working directory is clean and there is nothing to commit.
$ git status
On branch master
nothing to commit, working directory clean

Next we have the git log command. This command prints the history of the repository.
$ git log
commit 4b8b52d4071a04c7f98436aae959ab9b10fec2ec
Author: Shankar 
Date:   Thu Dec 11 22:24:28 2014 +0530

    Adding test file

Now let's add the remote origin to our local repo.
$ git remote add origin git@github.com:*****/gitBasics.git

After the remote is added, we can push our code to the remote repository. This can be done as follows:
$ git push -u origin master

To pull from the remote branch, use the following command:
$ git pull origin master

In order to see the differences between the current and the last committed version of code, use the following:
$ git diff HEAD
diff --git a/filename.txt b/filename.txt
index c9e358c..411cdda 100644
--- a/filename.txt
+++ b/filename.txt
@@ -1 +1 @@
-First File
+First File Modified

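If nothing has been staged yet, plain git diff shows the same changes, since it compares the working directory with the staging area: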
$ git diff
diff --git a/filename.txt b/filename.txt
index c9e358c..411cdda 100644
--- a/filename.txt
+++ b/filename.txt
@@ -1 +1 @@
-First File
+First File Modified

A line prefixed with '-' was deleted, and a line prefixed with '+' was added.

When we use the git add command, we stage the differences. Let's stage the differences first, and then see how to unstage and revert our changes to get back to the last committed snapshot. I have created another file, 'filename2.txt', committed it, and then made some changes to it.
$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

 modified:   filename2.txt

no changes added to commit (use "git add" and/or "git commit -a")
$ git add filename2.txt 
$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

 modified:   filename2.txt

To see the staged differences, use the following:
$ git diff --staged
diff --git a/filename2.txt b/filename2.txt
index f686acc..5701cbe 100644
--- a/filename2.txt
+++ b/filename2.txt
@@ -1 +1 @@
-Second File
+Second File Modified

You can unstage the files as follows:
$ git reset filename2.txt
Unstaged changes after reset:
M filename2.txt
$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

 modified:   filename2.txt

no changes added to commit (use "git add" and/or "git commit -a")

After unstaging, the changes can be undone as follows:
$ git checkout -- filename2.txt
$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working directory clean
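
The stage, unstage and discard steps above can be replayed safely in a throwaway repository. The script below is a sketch of that cycle; the function name and the identity values are placeholders of my own, and it works in a temporary directory so your real repositories are not touched.

```shell
# Sketch: replay the stage / unstage / discard cycle in a throwaway repo.
# The body runs in a subshell, so the caller's directory does not change.
demo_unstage_and_discard() (
  repo=$(mktemp -d) && cd "$repo" || exit 1
  git init -q
  git config user.name "Example User"        # placeholder identity
  git config user.email "user@example.com"   # placeholder identity

  echo "Second File" > filename2.txt
  git add filename2.txt
  git commit -q -m "Adding second file"

  echo "Second File Modified" > filename2.txt
  git add filename2.txt                      # stage the change
  git diff --staged --quiet || echo "staged change present"

  git reset -q filename2.txt                 # unstage; the edit stays on disk
  git checkout -- filename2.txt              # discard the edit
  cat filename2.txt                          # back to the committed content
)

# Usage:
#   demo_unstage_and_discard
```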

Let's talk about branches now:
To create a new branch, use the following:
$ git branch newBranch

Use the following to switch branches
$ git checkout newBranch
Switched to branch 'newBranch'
$ git status
On branch newBranch
nothing to commit, working directory clean

I have modified 'filename2.txt' and pushed changes to this branch.
$ git log
commit 889ab1f0f42e7efd5818f68b30a42ced587db320
Author: Anjana Shankar <*****@gmail.com>
Date:   Fri Dec 12 10:14:05 2014 +0530

    Modified file on the branch

Now let's merge this branch into master. First we have to switch back to master; once on master, we can merge the branch.
$ git checkout master
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
$ git merge newBranch
Updating 7c4f3ad..889ab1f
Fast-forward
 filename2.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
$ git status
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
  (use "git push" to publish your local commits)
nothing to commit, working directory clean
$ git push
Total 0 (delta 0), reused 0 (delta 0)
To git@github.com:****/gitBasics.git
   7c4f3ad..889ab1f  master -> master

Finally as we are done with the branch, let's delete it.
$ git branch -d newBranch
Deleted branch newBranch (was 889ab1f).
$ git push origin --delete newBranch
To git@github.*****/gitBasics.git
 - [deleted]         newBranch
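
For practice, the local part of this branch workflow (create, commit, merge, delete) can be tried in a scratch repository without a remote. The following is a sketch; the function name and identity values are placeholders of my own, and the initial branch name is read from git because it may be master or main depending on your setup.

```shell
# Sketch: create a branch, commit on it, fast-forward merge it, delete it.
# Runs in a subshell inside a temporary directory; no remote is involved.
demo_fast_forward_merge() (
  repo=$(mktemp -d) && cd "$repo" || exit 1
  git init -q
  git config user.name "Example User"        # placeholder identity
  git config user.email "user@example.com"   # placeholder identity

  echo "Second File" > filename2.txt
  git add filename2.txt
  git commit -q -m "Adding second file"
  main_branch=$(git symbolic-ref --short HEAD)   # master or main

  git branch newBranch
  git checkout -q newBranch
  echo "Second File Modified" > filename2.txt
  git commit -q -a -m "Modified file on the branch"

  git checkout -q "$main_branch"
  git merge -q newBranch                     # fast-forward, no merge commit
  git branch -d newBranch >/dev/null         # safe: the branch is merged

  cat filename2.txt
  git rev-list --count HEAD                  # number of commits on the branch
)

# Usage:
#   demo_fast_forward_merge
```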

To see the remote branches available, use the following:
$ git branch -r
  origin/master

That's it for this post. I will try to cover a few advanced git concepts in upcoming posts.
Reference : Pro Git book

Friday, November 7, 2014

Configuring Hadoop on Mac OS X in pseudo-distributed cluster mode


Hadoop is an open-source Apache project that enables processing of extremely large datasets in a distributed computing environment. There are three different modes in which it can be run:

1. Standalone Mode
2. Pseudo-Distributed Mode
3. Fully-Distributed Mode

This post covers setting up Hadoop 2.5.1 in pseudo-distributed mode, in which each Hadoop daemon runs as a separate Java process.

Prerequisites


Java: Install Java if it isn’t installed on your Mac.
Homebrew: Homebrew is a package manager for Mac. You can find the installation instructions here.
Keyless SSH: First, ensure Remote Login under System Preferences -> Sharing is checked to enable SSH. Then generate the key pair.
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Now ssh into localhost and accept the host key when prompted.
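
If ssh still asks for a password, two common causes are a key that was appended more than once in a broken state or permissions that are too open. The sketch below adds a public key only when it is not already present and tightens the permissions. The function name authorize_key is my own, and the permission values are the usual convention rather than something taken from the Hadoop documentation.

```shell
# Hypothetical helper: append a public key to an authorized_keys file once
# and restrict permissions on the file and its directory.
authorize_key() {
  pub_key_file="$1"
  auth_file="${2:-$HOME/.ssh/authorized_keys}"
  ssh_dir=$(dirname "$auth_file")

  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
  touch "$auth_file" && chmod 600 "$auth_file"

  # -F fixed strings, -x whole line, -f read the pattern from the key file
  if ! grep -qxFf "$pub_key_file" "$auth_file"; then
    cat "$pub_key_file" >> "$auth_file"
  fi
}

# Usage:
#   authorize_key ~/.ssh/id_dsa.pub
```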

Installation


This is where Homebrew is used.
$ brew install hadoop
If you do not want to use Homebrew, or you want to install a specific version of Hadoop, you can download it from the Apache Hadoop site. Unpack the .tar to the location of your choice and assign ownership to the user setting up Hadoop.

Configuration


Every component of Hadoop is configured using an XML file located in /usr/local/Cellar/hadoop/2.5.1/libexec/etc/hadoop. MapReduce properties go in mapred-site.xml, HDFS properties in hdfs-site.xml, and common properties in core-site.xml. The general Hadoop environment properties are found in hadoop-env.sh.

hadoop-env.sh

Replace the existing HADOOP_OPTS with the following:
export HADOOP_OPTS="-Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"
If Homebrew was not used to install Hadoop, point JAVA_HOME to your Java installation.

core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

hdfs-site.xml

The Hadoop Distributed File System properties go in this config file. Since we are only setting up one node, we set the value of dfs.replication to 1.
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>


Execution

Before starting the daemons we must format the newly installed HDFS.
$ cd /usr/local/Cellar/hadoop/2.5.1/libexec/bin
$ hdfs namenode -format

Start the Daemons:
$ cd /usr/local/Cellar/hadoop/2.5.1/libexec/sbin
$ ./start-dfs.sh

Monitoring
Check the output of jps
$ jps
10756 NameNode
1282 
10842 DataNode
11022 Jps
10951 SecondaryNameNode
1842 

Alternatively, the web interface for the NameNode can be browsed at http://localhost:50070

Running Examples
1. Create the HDFS directories required to execute MapReduce jobs:
$ cd /usr/local/Cellar/hadoop/2.5.1/libexec/bin
$ hdfs dfs -mkdir /user
$ hdfs dfs -mkdir /user/<username>

2. Copy the input files to the Hadoop Distributed File System
$ hdfs dfs -put ../etc/hadoop input

3. Run the example provided
$ hadoop jar ../share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar grep input output 'dfs[a-z.]+'

4. View the output files on HDFS
$ hdfs dfs -cat output/*

Reference : http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation

Wednesday, October 8, 2014

Setting up keyboard shortcuts in Mac OS X

How many of you have wished that you could maximize a window without having to drag your mouse to the green button on the title bar? You will have to make your own keyboard shortcut for this one, since it isn’t set by default. This post shows how this can be done.
Go to System Preferences -> Keyboard.
Choose the Shortcuts tab.
From the left pane, select App Shortcuts. Click the + icon at the bottom of the right pane.

Enter Zoom in the Menu Title field, then click in the Keyboard Shortcut box and press the key combination you want to use.



Make sure to enter the exact name of the menu item that you want to add the shortcut for.