Tuesday, May 17, 2011

Jsoup HTML Parser


There are many open source HTML parsers for Java. In this blog post I will give some information about jsoup, an open source Java HTML parser that I have been working with recently. Instead of jsoup, you can use HTML Parser, Jericho HTML Parser, or any other parser library you prefer.

jsoup is a Java library for working with real-world HTML.

By using this library,

  • You can parse HTML from a URL, file, or string.
  • You can find and extract data, using DOM traversal or CSS selectors.
  • You can manipulate the HTML elements, attributes, and text.
  • You can clean user-submitted content against a safe white-list.

Getting Source Code.

Download the library and use it in your project. The current release version is 1.5.2.

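Take a look at the examples and get started with jsoup to parse HTML.

If you use Maven to manage the dependencies in your Java project, you do not need to download the library; just add the jsoup dependency (groupId org.jsoup, artifactId jsoup, version 1.5.2) to the dependencies section of your POM.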

This post is an introduction to jsoup. In other posts I will give some examples that I use in my real project. The examples will be about getting elements from HTML.

I hope this is useful.

Sunday, May 1, 2011

Java Primitive Data Types

In this post, I will give some information about the Java primitive data types, their ranges, and their default values.

The Java programming language is statically-typed, which means that all variables must first be declared before they can be used. Before using any variable in your program, you must declare the variable with its type and name.

For example:

int data = 1;

This declaration tells your program that there is a field named “data” that holds numerical data and has an initial value of 1.

Primitive data types and their ranges:

boolean :1 bit of information (its storage size is not precisely defined)

range – may take the values “true” and “false” only

byte :1 byte

range – from -128 to 127

short :2 bytes

range – from -32,768 to 32,767

int :4 bytes

range – from -2,147,483,648 to 2,147,483,647

long :8 bytes

range – from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807

float :4 bytes

range – from 1.40129846432481707e-45 to 3.40282346638528860e+38 (positive or negative)

double :8 bytes

range – from 4.94065645841246544e-324d to 1.79769313486231570e+308d (positive or negative)

char :2 bytes, unsigned, Unicode

range – from 0 to 65,535

String :a sequence of characters (String is a class, not a primitive type, but it is listed here because the language gives it special support)
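The ranges listed above do not have to be memorized, because the standard wrapper classes expose them as constants. The following is a minimal sketch; the class name PrimitiveRanges and the helper describe are made up for this illustration, while the MIN_VALUE and MAX_VALUE constants are part of the standard library.

```java
// Prints the range of several primitive types using the MIN_VALUE and
// MAX_VALUE constants of the standard wrapper classes.
class PrimitiveRanges {

    // Hypothetical helper: builds a line such as "byte: -128 to 127".
    static String describe(String type, Object min, Object max) {
        return type + ": " + min + " to " + max;
    }

    public static void main(String[] args) {
        System.out.println(describe("byte", Byte.MIN_VALUE, Byte.MAX_VALUE));
        System.out.println(describe("short", Short.MIN_VALUE, Short.MAX_VALUE));
        System.out.println(describe("int", Integer.MIN_VALUE, Integer.MAX_VALUE));
        System.out.println(describe("long", Long.MIN_VALUE, Long.MAX_VALUE));

        // For float and double, MIN_VALUE is the smallest positive value,
        // not the most negative one.
        System.out.println(describe("float", Float.MIN_VALUE, Float.MAX_VALUE));
        System.out.println(describe("double", Double.MIN_VALUE, Double.MAX_VALUE));

        // char is unsigned, so it is cast to int to print its numeric range.
        System.out.println(describe("char",
                (int) Character.MIN_VALUE, (int) Character.MAX_VALUE));
    }
}
```

Reading the limits from these constants keeps the program correct even if you misremember a digit.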

We must know the ranges of the types to use them effectively in our programs. For example, suppose we have a counter that starts at 0 and increases continuously. Initially the primitive type “int” will be sufficient. But if the counter eventually grows beyond 2,147,483,647, it will be outside the range of int. The value then silently wraps around to a negative number, so the program will not work correctly. For such a counter, “long” is the safer choice.
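The counter problem can be tried out directly. This is only a sketch: the class CounterOverflow and its method names are invented for the example, and Math.addExact is assumed to be available, which requires Java 8 or later.

```java
// Shows what happens when an int counter passes Integer.MAX_VALUE.
class CounterOverflow {

    // Plain int arithmetic wraps around silently on overflow.
    static int incrementUnchecked(int counter) {
        return counter + 1;
    }

    // Math.addExact (Java 8+) throws ArithmeticException instead of wrapping.
    static int incrementChecked(int counter) {
        return Math.addExact(counter, 1);
    }

    // A long counter has a much larger range, so the result still fits.
    static long incrementLong(long counter) {
        return counter + 1;
    }

    public static void main(String[] args) {
        int counter = Integer.MAX_VALUE; // 2147483647

        System.out.println(incrementUnchecked(counter)); // prints -2147483648
        System.out.println(incrementLong(counter));      // prints 2147483648

        try {
            incrementChecked(counter);
        } catch (ArithmeticException e) {
            System.out.println("Overflow detected: " + e.getMessage());
        }
    }
}
```

Whether to switch to long or to fail loudly with a checked addition depends on the program; both avoid a counter that quietly turns negative.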

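If you declare a field and do not assign a value to it, it will be assigned its default value. This applies to fields only: a local variable has no default value and must be assigned before it is used, otherwise the code will not compile.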

Default values for the data types:

Data Type Default Value

byte :0

short :0

int :0

long :0L

float :0.0f

double :0.0d

char :'\u0000'

boolean :false

String :null
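The default values in the list above can be checked with a small class whose fields are deliberately left unassigned. The class and field names here are made up for the illustration.

```java
// Fields are left uninitialized on purpose so that they get default values.
class DefaultValues {
    byte byteField;
    short shortField;
    int intField;
    long longField;
    float floatField;
    double doubleField;
    char charField;
    boolean booleanField;
    String stringField;

    public static void main(String[] args) {
        DefaultValues d = new DefaultValues();

        System.out.println("byte: " + d.byteField);
        System.out.println("short: " + d.shortField);
        System.out.println("int: " + d.intField);
        System.out.println("long: " + d.longField);
        System.out.println("float: " + d.floatField);
        System.out.println("double: " + d.doubleField);
        // The default char is not a visible character, so print its code.
        System.out.println("char code: " + (int) d.charField);
        System.out.println("boolean: " + d.booleanField);
        System.out.println("String: " + d.stringField);

        // A local variable gets no default. Uncommenting the next two
        // lines causes a compile-time error.
        // int local;
        // System.out.println(local);
    }
}
```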

I hope this is useful.

Saturday, April 23, 2011

Java: For Loop

The “for loop” is a looping construct that provides a compact way to iterate over a range of values. It works like the “while loop”, but it keeps the initialization, the termination condition, and the increment together in one statement.

The general form (syntax) of the “for statement” can be expressed as follows:

for (initialization; termination; increment) {
    statements; // code block to be executed
}

The three parts of this syntax are explained below.

initialization: This expression initializes the loop. It is executed once, when the loop begins, and allows the loop variables to be initialized.

//it can be such as

int i = 0;

int j = 1;

termination (condition): This part of the for loop is checked before each iteration. If the condition is true, the code block is executed; otherwise the loop ends and the code block is skipped.

//it can be such as

i < 10;

j <= 11;

increment: This part of the loop is executed after each iteration and updates the loop variable. As a programmer you can increment the value by one, by two, or by any step you want.

//it can be such as

i++;

j++;
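To show that the increment part is not limited to i++, here is a short sketch with a step of two, a countdown, and two loop variables. The class ForLoopSteps and its method names are invented for this example.

```java
// Variations on the increment part of a for loop.
class ForLoopSteps {

    // Adds the even numbers from 0 up to and including limit, two by two.
    static int sumEvens(int limit) {
        int sum = 0;
        for (int i = 0; i <= limit; i += 2) {
            sum += i;
        }
        return sum;
    }

    // Counts down from start to 1; the third part decrements here.
    static String countdown(int start) {
        StringBuilder sb = new StringBuilder();
        for (int i = start; i >= 1; i--) {
            sb.append(i).append(' ');
        }
        return sb.toString().trim();
    }

    // Two variables in one loop: i moves up and j moves down until they meet.
    static int stepsUntilMeet(int low, int high) {
        int steps = 0;
        for (int i = low, j = high; i < j; i++, j--) {
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(sumEvens(10));          // prints 30
        System.out.println(countdown(3));          // prints 3 2 1
        System.out.println(stepsUntilMeet(0, 10)); // prints 5
    }
}
```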

The general form of the “for loop” is illustrated above. Now I will give a code block and its output, which should be a useful example for beginners.

Code Block:

class ForLoop {

    public static void main(String[] args) {

        // Prints the value of i for i = 1, 2, 3
        for (int i = 1; i < 4; i++) {

            System.out.println("Value of i is: " + i);

        }

    }

}

Output of the given code block will be:

Value of i is: 1

Value of i is: 2

Value of i is: 3
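As mentioned at the start of this post, the for loop works like a while loop. The sketch below writes the same loop from 1 to 3 in both forms and checks that they produce identical text. The class ForVersusWhile and its methods are made up for the comparison.

```java
// The same loop written once with "for" and once with "while".
class ForVersusWhile {

    static String withFor() {
        StringBuilder out = new StringBuilder();
        for (int i = 1; i < 4; i++) {
            out.append("Value of i is: ").append(i).append('\n');
        }
        return out.toString();
    }

    static String withWhile() {
        StringBuilder out = new StringBuilder();
        int i = 1;          // initialization
        while (i < 4) {     // termination condition
            out.append("Value of i is: ").append(i).append('\n');
            i++;            // increment
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(withFor());
        // prints true: both loops build exactly the same text
        System.out.println(withFor().equals(withWhile()));
    }
}
```

The for version keeps the three parts on one line, which makes it harder to forget the increment and end up with an endless loop.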

I hope it will be useful for programmers, especially beginners.