Monday, June 10, 2019

Removing boilerplate from webpages using Python

I am an avid Firefox user and, more often than not, when I visit a webpage containing an article, I end up clicking the "Reader Mode" button in the address bar so that I can remove all the useless noise and focus on the main content that the page has to offer.

A news article, like this one shown below

becomes like this after entering the reader mode.
As you can see, the ads and other boilerplate are gone, and I can focus on the actual content without straining my eyes to find what I visited the webpage for.


It turns out that one can easily write a Python program (Python being my language of choice for such quick experiments) to do the same thing.

We will use the Python readability library to do this.

Here is the program. It seems to work for me.



Friday, March 29, 2019

Using Regular Expressions in Go

Go provides the regexp package for working with regular expressions. The syntax is largely the same as the one supported by languages like Perl and Python, although Go's implementation (based on RE2) leaves out a few features such as backreferences.

Let's start with the basics. We will use the regexp package to find out whether a given text contains any alphanumeric words. By alphanumeric words, we mean any word that is made up of letters or numerals.

To do this, we compile a regular expression into a Regexp object. Then we use this object on input strings to check whether they contain any alphanumeric words and, if they do, print each one of them.

Here is a program for doing this.
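The listing itself is not shown on this page, so what follows is only a minimal sketch of such a program, not the original one. It assumes that "alphanumeric word" means a run of ASCII letters and digits; the pattern and the findWords helper are my own choices for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// wordRe matches runs of ASCII letters and digits. The pattern is an
// assumption about what "alphanumeric word" means in this lesson.
var wordRe = regexp.MustCompile(`[A-Za-z0-9]+`)

// findWords returns every alphanumeric word found in text,
// or nil when there is none.
func findWords(text string) []string {
	return wordRe.FindAllString(text, -1)
}

func main() {
	inputs := []string{"Go 1.12 was released in 2019!", "--- ??? ---"}
	for _, in := range inputs {
		words := findWords(in)
		if words == nil {
			fmt.Printf("%q contains no alphanumeric words\n", in)
			continue
		}
		fmt.Printf("%q contains:\n", in)
		for _, w := range words {
			fmt.Println("  ", w)
		}
	}
}
```

MustCompile panics on an invalid pattern, which is acceptable for a fixed pattern like this one; for patterns that come from user input, regexp.Compile returns an error instead.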




Tuesday, March 5, 2019

Writing the simplest HTTP server in Go

To write a very simple HTTP server in Go that serves just a single static page, you only need to do the following.

  • Import the net/http package.
  • Start listening on a port of your choice using http.ListenAndServe.
  • Serve requests using an http.ResponseWriter.

Here is the code for this:
package main

import (
    "log"
    "net/http"
)

func viewHandler(w http.ResponseWriter, r *http.Request) {
    w.Write([]byte("Hello there..."))
}

func main() {
    http.HandleFunc("/", viewHandler)
    log.Fatal(http.ListenAndServe(":8087", nil))
}
Using http.HandleFunc, we choose to handle incoming requests on the '/' path with the viewHandler function. After this, we call ListenAndServe() to start handling incoming HTTP requests on port 8087.

Go ahead and build this, then run the executable. Your server has started. Now, open the browser and go to http://localhost:8087/

You should see 'Hello there...'.
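If you would rather check the handler without opening a browser or a port, the net/http/httptest package can call it in memory. This is only a sketch: viewHandler is the function from the program above, while fetchBody is a hypothetical helper I made up for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// viewHandler is the same handler as in the server program.
func viewHandler(w http.ResponseWriter, r *http.Request) {
	w.Write([]byte("Hello there..."))
}

// fetchBody (hypothetical helper) sends one in-memory GET request to h
// and returns the recorded status code and response body.
func fetchBody(h http.HandlerFunc, path string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	rec := httptest.NewRecorder()
	h(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	status, body := fetchBody(viewHandler, "/")
	fmt.Println(status, body)
}
```

No network listener is involved here, so this works even when port 8087 is already taken.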

Go has a pretty powerful HTTP library that allows you to do a lot of things. We shall explore more in the future.


Friday, February 22, 2019

File Handling in Go - Part 2

In Part 1 of this lesson, we learned how to open, read and write files using Go. In this lesson, we shall learn about other file handling functions that Go provides.

Seeking

We often need to seek to a particular point in a file and then read or write from there. For this, the os package provides the Seek() method on File.


func (f *File) Seek(offset int64, whence int) (ret int64, err error)

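The Seek() function takes two arguments, offset and whence. whence tells the function where to seek from, and offset tells it how far to seek. It returns the new offset relative to the start of the file.
whence can have one of the following values (the io package also names them as constants):
  • 0 - Seek from the beginning (io.SeekStart)
  • 1 - Seek from the current position in the file (io.SeekCurrent)
  • 2 - Seek from the end (io.SeekEnd)
So, for example: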

Seek(0,0) is going to seek to the start of the file.
Seek(0,2) is going to seek to the end of the file.
Seek(10,0) is going to seek 10 bytes from the start of the file.
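To see the three whence values side by side, here is a small sketch that uses the named constants from the io package instead of the bare numbers. The scratch file, its 26-byte contents and the seekDemo helper are assumptions made up for this example.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// seekDemo (hypothetical helper) writes a 26-byte scratch file and returns
// the offsets reported by three Seek calls: to byte 10 from the start,
// 5 bytes on from there, and to the end of the file.
func seekDemo() (int64, int64, int64, error) {
	f, err := os.CreateTemp("", "seek-demo-*.txt")
	if err != nil {
		return 0, 0, 0, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	if _, err := f.WriteString("abcdefghijklmnopqrstuvwxyz"); err != nil {
		return 0, 0, 0, err
	}
	fromStart, err := f.Seek(10, io.SeekStart)
	if err != nil {
		return 0, 0, 0, err
	}
	fromCurrent, err := f.Seek(5, io.SeekCurrent)
	if err != nil {
		return 0, 0, 0, err
	}
	fromEnd, err := f.Seek(0, io.SeekEnd)
	if err != nil {
		return 0, 0, 0, err
	}
	return fromStart, fromCurrent, fromEnd, nil
}

func main() {
	a, b, c, err := seekDemo()
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println(a, b, c)
}
```

Each call returns the resulting position measured from the start of the file, which is why seeking to the end is a handy way to learn the file's size.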

Let's look at a program that uses the Seek() function.


package main
import (
    "fmt"
    "os"
)

func main() {
    f,err := os.Open("file-1.txt")

    if err != nil {
        fmt.Println("Error in Opening file",err)
        os.Exit(1)
    }

    //Seek 10 bytes from the start of file
    offset,err := f.Seek(10,0)

    if err != nil {
        fmt.Println("Error in Seeking",err)
        os.Exit(1)
    }

    // Make a buffer to read data from the file
    buf := make([]byte,10)

    //Read some bytes from the file
    readbytes, err := f.Read(buf)

    if err != nil {
        fmt.Println("Error in reading from file",err)
        os.Exit(1)
    }

    //Print the data read from file
    fmt.Println("Data read from file from offset",offset,"Total bytes read",readbytes)
    // Only the first readbytes bytes of buf hold data from this read
    fmt.Printf("%s\n",buf[:readbytes])


    //Seek to the start of the file
    offset,err = f.Seek(0,0)

    if err != nil {
        fmt.Println("Error in seeking to front",err)
        os.Exit(1)
    }

    // Read some bytes from the start of the file
    readbytes,err = f.Read(buf)

    if err != nil {
        fmt.Println("Error in reading from file",err)
        os.Exit(1)
    }

    fmt.Println("Data read from file from offset",offset,"Total bytes read",readbytes)
    // Only the first readbytes bytes of buf hold data from this read
    fmt.Printf("%s\n",buf[:readbytes])


    //Seek to the end of file
    offset, err = f.Seek(0,2)

    if err != nil {
        fmt.Println("Error in seeking to end",err)
        os.Exit(1)
    }

    fmt.Println("File-Offset",offset)
}

So, I created a file called file-1.txt and ran the above program. Notice that, at the end of the program, we seek to the end of the file. If you attempt to read from this point on, Read() returns an io.EOF error.

The one thing you need to remember while using Seek() is that its behavior is not specified if the file has been opened with the O_APPEND flag. So you should avoid using Seek() on a file opened in append mode.


File Stat - Getting to know about a file


The Stat() function provides us with useful information about a file that already exists on your system.

func (f *File) Stat() (FileInfo, error)

Stat() returns a FileInfo value that describes the file: its name, size, mode, modification time and so on.


type FileInfo interface {
 Name() string       // base name of the file
 Size() int64        // size of file
 Mode() FileMode     // file mode bits
 ModTime() time.Time // modification time
 IsDir() bool        // Is file a directory
 Sys() interface{}   // underlying data source (can return nil)
}


Here is a sample program that shows the usage of the Stat() function.


package main

import (
    "fmt"
    "os"
)

func main() {
    f, err := os.Open("file-1.txt")

    if err != nil {
        fmt.Println("Err in Opening file",err)
        os.Exit(1)
    }

    finfo, err := f.Stat()

    if err != nil {
        fmt.Println("Err in file stat",err)
        os.Exit(1)
    }

    fmt.Println("Name of the file:", finfo.Name())
    fmt.Println("Size of the file:", finfo.Size(), "bytes")
    fmt.Println("Mode of the file:", finfo.Mode())
    fmt.Println("Is the file a directory:", finfo.IsDir())
}


Alternatively, one could simply call os.Stat() with the name of the file. In that case, you don't need to open the file to get a File object.

    finfo, err := os.Stat("file-1.txt")
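One common use of os.Stat() is checking whether a file exists before doing anything else with it. The sketch below is one way to do that, assuming a reasonably recent Go version for the io/fs package; fileExists is a hypothetical helper name, not something the os package provides.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// fileExists (hypothetical helper) reports whether name exists, using
// os.Stat so that the file never has to be opened.
func fileExists(name string) (bool, error) {
	_, err := os.Stat(name)
	if err == nil {
		return true, nil
	}
	if errors.Is(err, fs.ErrNotExist) {
		// A missing file is an expected answer here, not a failure.
		return false, nil
	}
	// Anything else (for example a permission problem) is a real error.
	return false, err
}

func main() {
	for _, name := range []string{".", "no-such-file-for-this-demo.txt"} {
		ok, err := fileExists(name)
		fmt.Println(name, ok, err)
	}
}
```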


There are a whole lot of other functions provided by os package, that you can find in Golang OS package documentation. Check it out and use the functions as per your need.

Feel free to comment on what is missing or could be enhanced in this lesson. 

Monday, February 18, 2019

Books worth reading

Being an avid reader has its advantages. You get to delve into worlds that only exist in someone else's imagination. You learn a lot from experiences of those who have put them down on paper for everyone else to read.
When I was a kid, I remember that I mainly read classics, from Charles Dickens to Arthur Conan Doyle.
There was a time when I got really fascinated by murder mysteries and I read almost all of Agatha Christie's Hercule Poirot books.
As I grew up, I started reading non-fiction books as well. 
When I was down or lost or just needed some inspiration, I sought help from some of the great self-help books out there.
Starting a job after college took a toll on my reading, but I did not give up on it altogether.
For the last five years or so, I have really been into science fiction books. I recently finished the Three-Body Problem trilogy by Cixin Liu, and am now moving on to Anathem by Neal Stephenson.

As a side interest, I will start compiling a list of books that I have already read or plan to read. This list will be regularly updated on this page and, from time to time, I may also write about the books that I read.

Here is the list.

Science Fiction Books
  1. Animal Farm
  2. Snow Crash 
  3. The Three-Body Problem
Non Fiction Books
  1. Thinking Fast and Slow
  2. Meditations
  3. The Selfish Gene

Saturday, February 16, 2019

File Handling in Go - Part 1

Go provides the os package for using operating system services, and that includes file handling.

Let's go through a step-by-step process and learn how to open files and read from and write to them in Go.

Opening a file

Go provides two function calls, Open() and OpenFile(), to open a file. Here are the declarations of these two:

func Open(name string) (*File, error)

func OpenFile(name string, flag int, perm FileMode) (*File, error) 

Open() takes the name of the file to open as an argument and returns a file handle (or object) and an error, if there is any. The thing to keep in mind is that a file opened with Open() is opened in read-only mode. So you can only read from the file and not write to it. To be able to write to a file, we have to use the OpenFile() function.

To start with, let's use the Open() call to get the hang of it. Here's the program for it.

package main
import (
    "fmt"
    "os"
)

func main() {

    filehandle,err := os.Open("hello.txt")
    if err == nil {
        fmt.Println("File Opened successfully", filehandle)
    } else {
        fmt.Println("Error :", err)
    }
}

So, I created a file named hello.txt in the same directory as my Go program and then ran this program. In the above program, we import the os package, which provides us with the OS interface. We call the Open() function and check for an error. If there is no error, we print the file handle, which is a pointer to a File structure, so the output is an opaque address-like value rather than the file's contents.

Reading from a file

So that was a pretty basic example. Now let's try to read something from the opened file. For that, we will use the following function provided by the os package.

func (f *File) Read(b []byte) (n int, err error)


The Read() function takes a byte slice, b, as input and fills it with whatever it reads from the file, up to len(b) bytes. It returns the number of bytes it read from the file, n, and an error if there was any. At the end of the file, Read() returns 0 and io.EOF.

Let's enhance the above program to read something from the file, hello.txt.

package main
import (
    "fmt"
    "os"
)

func main() {

    filehandle,err := os.Open("hello.txt")
    b := make([]byte, 20)

    if err == nil {
        fmt.Println("File Opened successfully", filehandle)

        n, err := filehandle.Read(b)

        if err == nil {
            // Print only the n bytes that were read, not the unused tail of b
            fmt.Printf("Read Success. N=[%d], Text=[%s]", n, b[:n])
        } else {
            fmt.Println("Read Error:",err)
        }

    } else {
        fmt.Println("Error :", err)
    }
}


So we added a byte slice, b, with space for 20 bytes. We called the Read() function on the file handle that we received from Open() and then, if there was no error, printed whatever we read from the file.

Writing to a file

So that was pretty simple, I hope. Opening and reading from a file in Go is quite straightforward. Now, let's explore how to write to a file.

Remember that the Open() function opens a file in read-only mode. To be able to write to a file, we need to use the OpenFile() function. First, let's understand how to use it.

func OpenFile(name string, flag int, perm FileMode) (*File, error)

The first argument of OpenFile() is the name of the file that we would like to open.
Next is an integer called flag. It is built from the following values.

const (
 // Exactly one of O_RDONLY, O_WRONLY, or O_RDWR must be specified.
 O_RDONLY int = syscall.O_RDONLY // open the file read-only.
 O_WRONLY int = syscall.O_WRONLY // open the file write-only.
 O_RDWR   int = syscall.O_RDWR   // open the file read-write.
 // The remaining values may be or'ed in to control behavior.
 O_APPEND int = syscall.O_APPEND // append to the file when writing.
 O_CREATE int = syscall.O_CREAT  // create a new file if none exists.
 O_EXCL   int = syscall.O_EXCL   // used with O_CREATE, file must not exist.
 O_SYNC   int = syscall.O_SYNC   //open for synchronous I/O.
 O_TRUNC  int = syscall.O_TRUNC  //truncate file when opened.
)


So, if we open a file only for reading, the value of flag will be O_RDONLY. If we open a file only for writing, then flag will be O_WRONLY, and so on.

As the documentation above says, flag must contain exactly one of O_RDONLY, O_WRONLY or O_RDWR. The rest of the values in the above block can be ORed with any of the first three, depending on the requirement.

For example,
if we wanted to open the file for reading and writing, and wanted any data we write to be appended to it, the value of flag would be
O_RDWR | O_APPEND.
And if we wanted to open a file for writing and have it created if it does not exist, the value of flag would be
O_WRONLY | O_CREATE
  

The next argument, perm, is of type FileMode. This argument defines the permissions that the file has and what kind of file it is. It is a 32-bit unsigned integer whose least significant 9 bits represent the standard Unix permissions, as in rwxrwxrwx. So, while creating the file, you can specify the permissions that the file is going to be created with (on Unix-like systems, the process umask is applied on top of them).

The other bits in FileMode have their own significance, but we will not cover them in this lesson.
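If the octal notation for permissions is unfamiliar, FileMode can print itself in the rwx form. This short sketch shows the mapping for a few common values; describe is a hypothetical helper, and the three modes are just examples I picked.

```go
package main

import (
	"fmt"
	"os"
)

// describe (hypothetical helper) returns the symbolic form of the nine
// permission bits in mode, ignoring any file-type bits.
func describe(mode os.FileMode) string {
	return mode.Perm().String()
}

func main() {
	for _, m := range []os.FileMode{0777, 0644, 0600} {
		// uint32(m) prints the raw number in octal; describe gives rwx form.
		fmt.Printf("%04o -> %s\n", uint32(m), describe(m))
	}
}
```

Each octal digit covers one group of three bits: the first is for the owner, the second for the group and the third for everyone else.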

So, now that we have understood each argument of the OpenFile() function, let's write a short program that will write some text to a file.

To write to a file, we shall use the Write() function.

func (f *File) Write(b []byte) (n int, err error)

The Write() function takes a byte slice as an argument and writes its contents to the file. It returns the number of bytes written to the file and an error, if there is any.

Here's the program that shows the usage of OpenFile() and Write().

package main
import (
    "fmt"
    "os"
)

func main() {

    f, err := os.OpenFile("gotest.txt",os.O_RDWR|os.O_CREATE,0777)

    if err == nil {

        fmt.Println("File Opened for writing successfully")

        b := []byte("Hello there")

        n,err := f.Write(b)

        if err == nil {
            fmt.Printf("Written %d bytes successfully",n)
        } else {
            fmt.Println("Error in Writing to file",err)
        }

    } else {

        fmt.Println("Error in Opening file", err)

    }
}

In the above program, we opened a file "gotest.txt" with flag O_RDWR | O_CREATE and file permission as 0777. We wrote a byte array containing the text "Hello there" to the file. We check for error as always before declaring success.

Well, that is it for this lesson. We shall see other File Handling related functions provided by os package in the next lesson.

Part 2 of this lesson is now available.


Friday, February 15, 2019

User input in Go language

Taking user input in Go is easy. Let's go through a few examples of how to do this.

We shall use the functions provided by the fmt package to take user input.

Reading an Integer from console

Look at the example program below. Here we ask the user to input an integer and use the Scan function of the fmt package to store the input in an integer variable. Then we print its value to confirm that our program worked correctly.

package main

import "fmt"

func main() {
    var a int
    fmt.Printf("Enter an integer -> ")
    fmt.Scan(&a)
    fmt.Println("You entered:", a)
} 
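The program above ignores the values that Scan returns, so typing something like "abc" silently leaves a at 0. Here is a sketch of the same idea with the error checked. To keep it runnable without typing anything, it uses fmt.Sscan, which parses a string the same way Scan parses console input; parseInt is a hypothetical helper.

```go
package main

import "fmt"

// parseInt (hypothetical helper) reads one integer from input the way
// fmt.Scan would read it from the console, and reports bad input
// through the returned error.
func parseInt(input string) (int, error) {
	var a int
	_, err := fmt.Sscan(input, &a)
	return a, err
}

func main() {
	for _, in := range []string{"42", "abc"} {
		a, err := parseInt(in)
		if err != nil {
			fmt.Printf("%q is not an integer: %v\n", in, err)
			continue
		}
		fmt.Println("You entered:", a)
	}
}
```

In the console version, the same check would be written against the second return value of fmt.Scan(&a).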


Reading Text from console
 
Now, let's say that you need to read a line of text from the console. In that case, we can use the functions provided by the bufio package.

Here is an example:

package main

import (
    "fmt"
    "bufio"
    "os"
)

func main() {
    var mytext string
    var err error

    txtReader := bufio.NewReader(os.Stdin)

    fmt.Println("Enter some text ( Press Enter when done ) ")

    mytext, err = txtReader.ReadString('\n')

    if err == nil {
        fmt.Println("Here's what you wrote: \n", mytext)
    } else {
        fmt.Println("Error: ", err)
    }

}
As you can guess already, in the above example we use the NewReader function of the bufio package to get a reader object that helps us read from the standard input, i.e. the console.

We then use the ReadString function of the reader object and tell it that the input terminates with a newline character. The returned string includes that newline.

We check for an error and then print the input text.

In the above example, the reader object returned by bufio.NewReader() has a read buffer of the default size, which is 4096 bytes. ReadString() itself is not limited by this size and keeps reading until it finds the delimiter, so longer lines still come back whole. If you expect large inputs, you can use bufio.NewReaderSize() to choose a bigger buffer and cut down the number of reads from the console.

Monday, February 4, 2019

Understanding Pointers - Part 2 - Usage and Pointer Arithmetic

In Part 1 of this lesson, we explored the definition of pointers. We also saw how to declare pointers of various types and how to use the * operator to print the value stored at the address assigned to a pointer.

Having made ourselves familiar with basic pointer syntax, we can now look into how pointers are used in day-to-day programming. While doing this, we shall also learn another key concept related to pointers called pointer arithmetic.

We shall do this by going through a series of example programs. I would encourage you to type each example program in your favorite editor, then compile and run it.

This will make you more and more comfortable with pointers as we go forward.

Have a look at the program below:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    int *ptr = NULL;
    
    ptr = (int *) malloc ( sizeof(int) * 10 );
    
    if ( NULL == ptr )
    {
        printf("Unable to allocate memory, exiting\n");
        exit(EXIT_FAILURE);
    }
    
    /* %p with a (void *) cast is the portable way to print a pointer */
    printf("Starting Address of allocated memory space => %p\n", (void *)ptr);
    
    ptr = ptr + 1;
    
    printf("Address after jumping one ahead => %p\n", (void *)ptr);
    
    ptr = ptr - 1;
    
    printf("Address after jumping one back => %p\n", (void *)ptr);
    
    free(ptr);
    
    exit(EXIT_SUCCESS);
}

In the above program, we have done a lot of new things that you didn't encounter in the first lesson. Let's go through this program line by line and understand what's happening.

  1. We declare an integer pointer 'ptr' and assign it a value of NULL. In the first lesson, we didn't initialize any of our pointers to NULL. But we should have, as it is considered good practice.
  2. Then we use the dynamic memory allocation function, malloc(), to allocate memory for 10 integers. If malloc() is successful, it returns a pointer, which is stored in the pointer variable 'ptr'. As expected, the value in ptr is a memory address. This memory address is the starting address of the memory space allocated to us. The total length of this allocated space is the size of 10 integers, which on most machines is going to be 40 bytes - 4 bytes for each integer.
  3. In the next block of code, we do some error handling. We check whether malloc() was successful in allocating memory. If malloc() fails for any reason, it returns NULL. It is always recommended that we check the return value of malloc() to see if we have really been allocated any memory. If malloc() does return NULL, we exit the program after printing an error message.
  4. Next, we print the value of ptr. This is the starting address of the block of memory allocated to us.
  5. Next, we add 1 to ptr. This is the interesting part and our first leap into the world of pointer arithmetic. Normally, if we add 1 to an integer variable, we get a value that is incremented by one. E.g. if I have an integer that stores the value 2 and I add 1 to it, I get the value 3. But with pointers something different happens. In our case, when we increment 'ptr' by one, it jumps to the next integer address in the memory space. So after executing this line of code, ptr stores the address of the next integer in our allocated memory. This address is 4 more than the starting address because, as we have already discussed, an integer takes up 4 bytes of space here.
  6. Next, we print ptr. As we saw in the last point, this is the address of the next integer in the allocated memory. Its value is 4 more than the starting address of the allocated memory.
  7. Next, we subtract 1 from ptr. We saw that when we added 1 to ptr, it jumped one integer ahead. Similarly, if we subtract 1 from ptr, it should jump 1 integer back. That's what happens. ptr jumps back 1 integer and again points to the starting address of the allocated memory, where it was pointing originally.
  8. Next, we print the value of ptr.
  9. Next, we call the free() function to release the allocated memory. This is again a very important step. You should release any dynamically allocated memory as soon as you have no more use for it.
  10. We exit the program.
Here is the output that I get when I run the above program:

Starting Address of allocated memory space => 0xa32a98

Address after jumping one ahead => 0xa32a9c

Address after jumping one back => 0xa32a98

Note that after moving the pointer one step ahead, we get an address that is 4 greater than the starting address. This indicates that an integer takes up 4 bytes on my system.

In the last program, we dynamically allocated some memory from the heap and showcased pointer increment and decrement operations. We can also use pointer arithmetic on memory declared on the stack, or on global or static memory. This is actually how pointer arithmetic is most commonly used.

Let's check out another program where we use memory allocated on the stack.

#include <stdio.h>
#include <stdlib.h>

int main()
{
    int arr[5] = {1,2,3,4,5};
    int *ptr = NULL;

    ptr = &arr[0];
    /* %p with a (void *) cast is the portable way to print a pointer */
    printf("Value stored at Address [%p] => %d \n", (void *)ptr, *ptr);

    ptr++;

    printf("Value stored at Address [%p] => %d \n", (void *)ptr, *ptr);


    ptr++;

    printf("Value stored at Address [%p] => %d \n", (void *)ptr, *ptr);


    ptr++;

    printf("Value stored at Address [%p] => %d \n", (void *)ptr, *ptr);


    exit(EXIT_SUCCESS);
}

Here is the output that I get when I run the above program:
Value stored at Address [0x61ff18] => 1
Value stored at Address [0x61ff1c] => 2
Value stored at Address [0x61ff20] => 3
Value stored at Address [0x61ff24] => 4

In the above program, we declare an array of 5 integers. We take a pointer to an integer and point it to the address of the first element of the array.

Then, we keep incrementing the pointer by one and printing the address stored in the pointer and the value stored at that address.

I hope you now have a sense of how we can use pointers in our programs.

We shall elaborate on the uses of pointer arithmetic in the next part of this lesson.