Download

Download Julia from julialang.org and Julia IDEs from

Juno is a good IDE for writing and evaluating Julia code quickly. The IJulia notebook is good for writing tutorials and reports with the results of Julia code embedded in the document.

Once you've installed everything, I recommend opening the Juno IDE and going through its tutorial.

Quick start

I execute all Julia code below in IJulia. I suggest you create a folder on your desktop and make it your working directory, so that we have a place to write files. First, a couple of basic commands. To evaluate code in Juno, press Ctrl-D (it's in the Juno tutorial):

VERSION # print julia version number
pwd() # print working directory
homedir() # print the default home directory
# set working directory to DirectoryPath "C:/Users/TimDz/Desktop/julia-lang"
cd("C:/Users/TimDz/Desktop/julia-lang") 
3+5 # => 8
5*7 # => 35
3^17 # => 129140163
3^(1+3im) # im is the imaginary unit => -2.964383781426573 - 0.46089998526262876im
log(7) # natural log of 7 => 1.9459101490553132

It is interesting that Julia has complex numbers built in. Now, variables and functions:

a = cos(pi) + im*sin(pi) # assigning to a variable
-1.0 + 1.2246467991473532e-16im
b = e^(im*pi)
-1.0 + 1.2246467991473532e-16im
a == b # Boolean expression; this is Euler's formula evaluated at pi
true
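
The comparison above returns true because both sides round to exactly the same floating-point value. As a minimal sketch (the helper name euler_holds is my own, and I am assuming exp and isapprox accept complex arguments as they do in the Julia versions I know), the same formula can be checked at other angles with an approximate comparison, which is usually the safer choice for floating-point results:

```julia
# Hypothetical helper, not part of the original tutorial:
# checks exp(i*theta) ≈ cos(theta) + i*sin(theta) up to rounding error.
euler_holds(theta) = isapprox(exp(im * theta), cos(theta) + im * sin(theta))

# Check the formula at a handful of angles, including pi.
angles = [0.0, 0.5, 1.0, 2.0, pi]
println(all(map(euler_holds, angles)))
```

isapprox compares within a relative tolerance, so the check does not depend on both sides rounding identically.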

Let's see how to define functions. The Julia docs have a chapter on functions with more information.

plus2(x) = x + 2 # a compact way

function plustwo(x) # traditional function definition
    return x+2
end
plustwo (generic function with 1 method)
plus2(11)
13
plustwo(11)
13

Here is a Julia cheatsheet with the above and additional information in a concise form. Next, let's write a function that generates some data, which we will write to a CSV file, plot, and then save the plot.

Data frames, plotting, and file Input/Output

I decided to write a function $f(x)$ that performs the process from the Collatz conjecture. If $x$ is even, divide it by $2$; if $x$ is odd, multiply it by $3$ and add $1$. Repeat the process until you reach $1$. The Collatz conjecture proposes that, regardless of which positive integer you start with, you will always reach $1$. Here it is in explicit form: $$ f(x) = \begin{cases} x/2, & \mbox{if } x \mbox{ is even} \\ 3x+1, & \mbox{if } x \mbox{ is odd} \end{cases} $$ The function collatz(x) counts the number of iterations it takes for the starting number to reach $1$.

function collatz(x)
    # Given a positive integer x, repeatedly
    # - divide by 2 if x is even
    # - multiply by 3 and add 1 if x is odd
    # until x reaches 1; return the number of steps taken
    count = 0
    while x != 1
        if x % 2 == 0
            x = div(x, 2) # integer division keeps x an integer
        else
            x = 3*x + 1
        end
        count += 1
    end
    return count
end

collatz(2)
1
collatz(3)
7
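
To see where the count of 7 comes from, here is a small sketch that records every value visited instead of only counting steps. The name collatz_path is my own and is not used elsewhere in this tutorial:

```julia
# Hypothetical helper: return the whole sequence of values from x down to 1.
function collatz_path(x)
    path = [x]
    while x != 1
        # even: halve with integer division; odd: triple and add one
        x = iseven(x) ? div(x, 2) : 3x + 1
        push!(path, x)
    end
    return path
end

path = collatz_path(3)
println(path)              # visits 3, 10, 5, 16, 8, 4, 2, 1
println(length(path) - 1)  # number of steps, 7 for a start of 3
```

The number of steps is one less than the length of the path, because the starting value is included.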

Data frames

Now, let's create a data frame with the number of steps needed to reach 1 for each number from 1 to 1000. We will use the DataFrames package because the base Julia library does not have data frames.

# Pkg.add("DataFrames")
using DataFrames

# Before populating the data frame with Collatz data, let's see how to create a data frame
df = DataFrame(Col1 = 1:10, Col2 = ["a","b","c","d","e","f","a","b","c","d"])

# Now let's use the Collatz data
df = DataFrame(Number = 1:1000, NumofSteps = map(collatz,1:1000))
head(df)
     Number  NumofSteps
1    1       0
2    2       1
3    3       7
4    4       2
5    5       5
6    6       8

map() applies the collatz() function to every number in the range 1:1000, which represents the numbers 1, 2, 3, ..., 1000. In this instance, map() returns an array holding the result of collatz() for each of those numbers.
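
As a minimal sketch of how map() behaves, here it is on a toy function (square_plus_one is a made-up name for illustration), alongside two equivalent ways of writing the same thing:

```julia
# Hypothetical toy function used only to illustrate map().
square_plus_one(x) = x^2 + 1

a = map(square_plus_one, 1:5)           # named function
b = map(x -> x^2 + 1, 1:5)              # anonymous function
c = [square_plus_one(x) for x in 1:5]   # array comprehension

println(a)            # elements 2, 5, 10, 17, 26
println(a == b == c)  # all three give the same array
```

Which form to use is mostly a matter of taste; map() reads well when the function already has a name.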

# To get descriptive statistics 
describe(df)
Number
Min      1.0
1st Qu.  250.75
Median   500.5
Mean     500.5
3rd Qu.  750.25
Max      1000.0
NAs      0
NA%      0.0%

NumofSteps
Min      0.0
1st Qu.  26.0
Median   43.0
Mean     59.542
3rd Qu.  99.0
Max      178.0
NAs      0
NA%      0.0%

Before we save it, let's categorize the points based on whether the original number is even or odd.

# create new evenodd column
df = hcat(df, map(x -> if x % 2 == 0 "even" else "odd" end, 1:1000)) 
rename!(df, :x1, :evenodd) #rename it to evenodd
head(df)
     Number  NumofSteps  evenodd
1    1       0           odd
2    2       1           even
3    3       7           odd
4    4       2           even
5    5       5           odd
6    6       8           even

hcat(df, column) horizontally concatenates a data frame with one or more columns. I use map() with the anonymous function x -> if x % 2 == 0 "even" else "odd" end, which checks for divisibility by two, to create a column with "even" and "odd" as entries. The new column gets the name x1, so finally I rename it to evenodd. Let's save it:

# To save the data frame in the working directory (make sure to set the wd as described in 
# the beginning of the tutorial)
writetable("collatz.csv", df)
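
The even/odd labeling used for the evenodd column can also be written with the ternary operator, which some people find easier to read than a one-line if/else. This is only a sketch on a small range, and the name parity is my own:

```julia
# Hypothetical helper: label a number as "even" or "odd".
# cond ? a : b evaluates to a when cond is true and to b otherwise.
parity(x) = x % 2 == 0 ? "even" : "odd"

labels = map(parity, 1:6)
println(labels)  # odd, even, odd, even, odd, even
```

The result is an array of strings, which is the kind of column that was attached to the data frame with hcat().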

Plotting data

To plot the data we will use the Gadfly package. Cairo is needed to save plots as PDF and PNG files. First, I will do a simple plot.

# Pkg.add("Gadfly")
# Pkg.add("Cairo")
# Pkg.add("Compose")
# Pkg.update()
using Gadfly, Cairo

plot(df,x="Number", y="NumofSteps", Geom.point)
[Figure: scatter plot of NumofSteps (y-axis) against Number (x-axis)]

Looks pretty. You should be able to zoom in and out. Let's color the points based on whether the original number is even or odd. I will assign the plot to a variable and save it.

# assign plot to variable
a = plot(df,x="Number", y="NumofSteps", color = "evenodd", Geom.point) 
[Figure: scatter plot of NumofSteps (y-axis) against Number (x-axis), with points colored by evenodd (odd, even)]

It looks like the points for odd numbers overlay the points for even numbers. Let's plot even and odd numbers separately, one plot above the other.

a = plot(df[df[:evenodd] .== "odd",:],
         x="Number", y="NumofSteps", 
         Geom.point,
         Guide.xlabel("Odd"))

b = plot(df[df[:evenodd] .== "even",:],
         x="Number", y="NumofSteps", 
         Geom.point,  
         Guide.xlabel("Even"),
         Theme(default_color=colorant"#d4ca59"))

vstack(a,b)
[Figure: two stacked scatter plots of NumofSteps (y-axis) against Number (x-axis), one with x-axis label "Odd" and one with x-axis label "Even"]

In these plots, even numbers up to 1000 generally require fewer steps to reach one through the Collatz procedure than odd numbers. We can save this plot.
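
One rough way to check this visual impression is to compare the average number of steps for even and odd starting values. The sketch below uses only base Julia and redefines the step counter under a different name (collatz_steps, my own) so that it runs on its own; average is likewise a made-up helper:

```julia
# Self-contained step counter, equivalent in intent to collatz(x) above.
function collatz_steps(x)
    steps = 0
    while x != 1
        x = iseven(x) ? div(x, 2) : 3x + 1
        steps += 1
    end
    return steps
end

# Hypothetical helper: arithmetic mean of a collection of numbers.
average(v) = sum(v) / length(v)

evens = filter(iseven, 1:1000)
odds = filter(isodd, 1:1000)

println("mean steps, even starts: ", average(map(collatz_steps, evens)))
println("mean steps, odd starts:  ", average(map(collatz_steps, odds)))
```

Averages hide the spread that the plots show, so this complements the pictures rather than replacing them.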

# Save the plot in the working directory
draw(PNG("collatz-plot.png", 8inch, 10inch), vstack(a,b))

Conclusion

Julia is a comfortable language to work with, and many say it is the future of scientific computing. That may very well be true. One of the main reasons is Julia's JIT compiler, which makes Julia almost as fast as, and sometimes faster than, C. At this point, I find Julia not as good as R simply because R is more mature and has a bigger community. R also has better documentation and more questions on Stack Overflow. There are $109419$ questions with an R tag in contrast to $1251$ questions with the julia-lang tag as of 10/12/2015, and $1631$ as of 3/7/2016.

Julia is up and coming, and given enough time it could create competition for R. It is unlikely that Julia will compete in industry against Python, SAS, and R, but in academia it is a different story.

Resources used