How to use Java Jsoup Tutorial | 100% perfect for beginners
Kensoft PH (https://kensoftph.com), Thu, 16 Dec 2021
https://kensoftph.com/how-to-use-java-jsoup-tutorial-for-beginners/

Jsoup is a Java library for extracting and manipulating data from web pages using HTML5 DOM methods and CSS selectors. It can fetch and parse HTML from a URL, find and extract data through DOM traversal or CSS selectors, and manipulate HTML elements, attributes, and text. It can also sanitize user-submitted content to help prevent XSS attacks.
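As a rough illustration of the sanitizing feature mentioned above, here is a minimal sketch that uses Jsoup.clean with the built-in Safelist.basic() rule set. The class name SanitizeExample and the sample input are made up for this illustration, and the Safelist class assumes jsoup 1.14.1 or newer (older releases call it Whitelist).

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

public class SanitizeExample {

    // Removes every tag and attribute that Safelist.basic() does not allow.
    static String sanitize(String untrustedHtml) {
        return Jsoup.clean(untrustedHtml, Safelist.basic());
    }

    public static void main(String[] args) {
        // Hypothetical user input that tries to smuggle in a script tag
        String input = "<p>Hello <b>world</b><script>alert('xss')</script></p>";
        System.out.println(sanitize(input));
    }
}
```

The basic safelist keeps simple text formatting such as b and p and drops script elements. Which safelist is appropriate depends on how much markup you want to accept from users.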

Jsoup Example

You can use this library in your own Java application to scrape data from a URL. In this Jsoup tutorial, I will demonstrate how to use the library and its API (Application Programming Interface). If you want more Java programming tutorials, you can find them on kensoftph.com.

Jsoup Tutorial Java

In this section, you will learn how to download the library and add it to your Java program, whether you build with Ant, Maven, or Gradle. Let’s start with Ant and then go on to Maven and Gradle.

Jsoup Ant

If you are using Ant, you need to download the jar file (the core library). After downloading it, place it in your project library or in the global library. To add the jar to a single project, go to the project’s Libraries section and add the jar file there. Alternatively, register it as a global library so that you don’t have to locate the jar file every time you use it. Follow the steps below if you are using NetBeans.

  1. Go to Tools
  2. Click on Libraries
  3. Click on New Library and enter the Library name
  4. Add the jar file on the Classpath tab

That’s how you add the jar file to your NetBeans IDE’s global Libraries.

Jsoup Maven

If you are using Maven, you do not need to download the jar file. Simply add the following dependency to the <dependencies> section of your POM. If you don’t know the latest version of the library, check jsoup.org.

<dependency>
  <groupId>org.jsoup</groupId>
  <artifactId>jsoup</artifactId>
  <version>1.14.3</version>
</dependency>

Gradle

If you are using Gradle, add the following line to the dependencies block of your build file.

implementation 'org.jsoup:jsoup:1.14.3'
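Whichever build tool you use, a quick way to confirm that the library is on the classpath is to parse a small HTML string, which needs no network access. This is only a sketch; the class name ClasspathCheck and the sample HTML are invented for the illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class ClasspathCheck {

    // Parses an in-memory HTML string and returns the text of its <title> element.
    static String titleOf(String html) {
        Document doc = Jsoup.parse(html);
        return doc.title();
    }

    public static void main(String[] args) {
        // Made-up HTML used only to exercise the parser
        String html = "<html><head><title>Jsoup is on the classpath</title></head>"
                + "<body></body></html>";
        System.out.println(titleOf(html));
    }
}
```

If this compiles and prints the title, the dependency is set up correctly, and any later failure is more likely a network problem than a build problem.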

We can use the code below to check that the library is set up correctly and that the specified website loads successfully.

Document doc = Jsoup.connect("https://kensoftph.com").get();
System.out.println(doc.title());

The output will be the title of the specified website.

Jsoup Tutorial

Jsoup Example

In this section, I will show a few examples of how to use the library. The examples cover loading a website and extracting some of its HTML elements, which is commonly known as web scraping. To learn more about the library, go to jsoup.org and navigate to the Cookbook.

Example: Load a Document from a URL

Loading a document from a URL is simple, as shown earlier in this tutorial. In case you skipped that part, the code below connects to the website, fetches the page, and prints its title.

Document doc = Jsoup.connect("https://kensoftph.com").get();
System.out.println(doc.title());

Output

Kensoft PH
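In practice the request can fail, for example when the site is unreachable or returns an error status. The following is a minimal sketch of a more defensive version of the same call. The class name FetchTitle, the user agent string, and the 10 second timeout are assumptions chosen for the illustration, not values from the original project.

```java
import java.io.IOException;
import java.util.Optional;
import org.jsoup.HttpStatusException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class FetchTitle {

    // Returns the page title, or an empty Optional if the page could not be loaded.
    static Optional<String> fetchTitle(String url) {
        try {
            Document doc = Jsoup.connect(url)
                    .userAgent("Mozilla/5.0 (jsoup tutorial example)") // assumed value
                    .timeout(10_000) // milliseconds, assumed value
                    .get();
            return Optional.of(doc.title());
        } catch (HttpStatusException e) {
            // The server answered, but with an error status such as 404 or 500
            System.err.println("HTTP " + e.getStatusCode() + " for " + e.getUrl());
        } catch (IOException e) {
            // Connection refused, timeout, unknown host, and similar failures
            System.err.println("Could not load " + url + ": " + e.getMessage());
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        fetchTitle("https://kensoftph.com").ifPresent(System.out::println);
    }
}
```

Returning an Optional keeps the caller from having to deal with exceptions directly. Whether that suits your application, or whether you would rather let the IOException propagate, is a design choice.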

Example: Data extraction from a document

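This example demonstrates how to extract data from a document. Extracting data can be a little tricky at first, but it becomes much easier if you follow the example. The code is a JavaFX controller that loads a page, adds the text of every h3 element to a ListView, and shows the selected item in a label. To learn more, I recommend visiting jsoup.org’s Cookbook page.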

import java.io.IOException;
import java.net.URL;
import java.util.ResourceBundle;
import java.util.logging.Level;
import java.util.logging.Logger;
import javafx.fxml.FXML;
import javafx.fxml.Initializable;
import javafx.scene.control.Label;
import javafx.scene.control.ListView;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

/**
 * FXML Controller class
 *
 * @author KENSOFT
 */
public class FX_ScraperController implements Initializable {

    @FXML
    private Label lblFrom;
    @FXML
    private ListView<String> listView;
    @FXML
    private Label lblSelectedItem;

    /**
     * Initializes the controller class.
     */
    @Override
    public void initialize(URL url, ResourceBundle rb) {
        try {
            // loading the URL
            Document doc = Jsoup.connect("https://kensoftph.com").get();
            //System.out.println(doc.title());
            lblFrom.setText("Contents from: "+doc.title());
            
            // Extract the text of every <h3> element and add it to the ListView
            Elements elements = doc.getElementsByTag("h3");
            for (Element element : elements) {
                listView.getItems().add(element.text());
            }

            // Register the selection listener once, outside the loop;
            // registering it inside the loop would add one listener per element
            listView.getSelectionModel().selectedItemProperty().addListener(
                    (observable, oldItem, newItem) ->
                            lblSelectedItem.setText("Selected item: " + newItem));
        } catch (IOException ex) {
            Logger.getLogger(FX_ScraperController.class.getName()).log(Level.SEVERE, null, ex);
        }
    }
}

Output

This is the output of extracting data from an HTML document loaded from a URL.

Java Jsoup Tutorial
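The same kind of extraction can be tried without JavaFX and without a network connection. The sketch below parses a small, made-up HTML snippet and uses CSS selectors to collect the h3 headings and the link targets. The class name SelectorExample, the sample HTML, and the base URL https://example.com/ are assumptions for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SelectorExample {

    // Returns the text of every element that matches the CSS selector.
    static List<String> textsOf(Document doc, String cssQuery) {
        return doc.select(cssQuery).eachText();
    }

    // Returns the absolute URL of every link that has an href attribute.
    static List<String> linksOf(Document doc) {
        List<String> links = new ArrayList<>();
        for (Element link : doc.select("a[href]")) {
            links.add(link.absUrl("href")); // resolved against the base URL
        }
        return links;
    }

    public static void main(String[] args) {
        // Made-up HTML standing in for a downloaded page
        String html = "<h3>First post</h3><p><a href=\"/first\">Read</a></p>"
                + "<h3>Second post</h3><p><a href=\"/second\">Read</a></p>";
        // The second argument is the base URL used to resolve relative links
        Document doc = Jsoup.parse(html, "https://example.com/");
        System.out.println(textsOf(doc, "h3"));
        System.out.println(linksOf(doc));
    }
}
```

Here doc.select("h3") finds the same elements as doc.getElementsByTag("h3") in the controller, but the selector syntax also handles more specific queries such as "a[href]".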

If you wish to download the JavaFX project of the simple web scraping application, I’ve provided a download link below so you can quickly get started and tweak it yourself. If you like my tutorials, you can also watch my YouTube video below, which is based on this article.

YouTube Video

JavaFX Tutorial: Jsoup tutorial in Java | Web Scraping App