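How to Read Excel Files in Java

Java can read Excel files through several libraries, including Apache POI and JExcelApi. Apache POI is the most widely used and the most capable of them. Three things matter most when reading Excel files in Java: choosing a suitable library, understanding how an Excel file is structured, and handling large files. This article explains how to read Excel files with Apache POI and provides code examples and best practices.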

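I. Choosing a Library

Apache POI is the most commonly used open-source library for reading Excel files in Java. It supports both Excel 97-2003 (.xls) and Excel 2007 and later (.xlsx). JExcelApi is an alternative, but it only supports the Excel 97-2003 format and offers fewer features than Apache POI. This article therefore focuses on reading Excel files with Apache POI.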

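II. Understanding the Excel File Structure

Before reading an Excel file, it helps to understand its basic structure. An Excel file corresponds to a single workbook (Workbook). A workbook contains one or more worksheets (Sheet), and each worksheet is made up of rows (Row) and cells (Cell). Apache POI lets us walk this structure level by level.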
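
As a rough illustration of this hierarchy, the sketch below builds a small workbook in memory instead of opening a file and then walks it level by level. The class name `StructureDemo`, the sheet name, and the cell contents are made up for the example, and it assumes poi-ooxml is on the classpath.

```java
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import java.io.IOException;

class StructureDemo {
    // Counts every cell by descending Workbook -> Sheet -> Row -> Cell.
    static int countCells(Workbook workbook) {
        int cells = 0;
        for (Sheet sheet : workbook) {
            for (Row row : sheet) {
                for (Cell cell : row) {
                    cells++;
                }
            }
        }
        return cells;
    }

    // Builds a hypothetical two-row workbook purely for demonstration.
    static Workbook sampleWorkbook() {
        Workbook workbook = new XSSFWorkbook();
        Sheet sheet = workbook.createSheet("Demo");
        Row header = sheet.createRow(0);
        header.createCell(0).setCellValue("name");
        header.createCell(1).setCellValue("score");
        Row data = sheet.createRow(1);
        data.createCell(0).setCellValue("Alice");
        data.createCell(1).setCellValue(92.5);
        return workbook;
    }

    public static void main(String[] args) throws IOException {
        try (Workbook workbook = sampleWorkbook()) {
            System.out.println("sheets=" + workbook.getNumberOfSheets()
                    + ", cells=" + countCells(workbook));
        }
    }
}
```

Both loops rely on the fact that `Workbook`, `Sheet`, and `Row` are iterable, which is the same mechanism the file-based examples in this article use.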

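III. Installing Apache POI

Before writing any code, add Apache POI to your project. If you manage the project with Maven, add the following dependencies to pom.xml. Note that poi-ooxml normally pulls in poi, commons-collections4, and xmlbeans transitively, so the additional entries mainly serve to pin those versions explicitly: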

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>5.0.0</version>
</dependency>
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-collections4</artifactId>
    <version>4.4</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>5.0.0</version>
</dependency>
<dependency>
    <groupId>org.apache.xmlbeans</groupId>
    <artifactId>xmlbeans</artifactId>
    <version>4.0.0</version>
</dependency>

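IV. Basic Steps for Reading an Excel File

1. Load the Excel file

First, load the Excel file and create a workbook object. Apache POI provides the WorkbookFactory class, which detects the file type (.xls or .xlsx) automatically and returns the matching Workbook implementation.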

import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class ExcelReader {
    public static void main(String[] args) {
        String filePath = "path/to/excel/file.xlsx";
        // try-with-resources closes both the stream and the workbook
        try (FileInputStream fis = new FileInputStream(new File(filePath));
             Workbook workbook = WorkbookFactory.create(fis)) {
            // Use the workbook object to read sheets, rows, and cells
            System.out.println("Sheets: " + workbook.getNumberOfSheets());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
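
To give a feel for how the two formats can be told apart without trusting the file extension, here is a minimal JDK-only sketch that inspects the first bytes of a file. A .xls file is an OLE2 compound document and a .xlsx file is a ZIP archive, and each container starts with a well-known signature. This is an illustration rather than POI's actual detection code, and `FormatSniffer` and `sniff` are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

class FormatSniffer {
    // OLE2 compound document signature (the container used by .xls).
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header signature "PK\3\4" (the container used by .xlsx).
    private static final byte[] ZIP = { 0x50, 0x4B, 0x03, 0x04 };

    // Returns "ole2", "zip", or "unknown" based on the leading bytes.
    static String sniff(byte[] head) {
        if (startsWith(head, OLE2)) return "ole2";
        if (startsWith(head, ZIP)) return "zip";
        return "unknown";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: FormatSniffer <file>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            byte[] head = new byte[8];
            // A single read is enough for this sketch; it may return fewer bytes.
            int n = in.read(head);
            System.out.println(sniff(Arrays.copyOf(head, Math.max(n, 0))));
        }
    }
}
```

A ZIP signature only says that the container is a ZIP archive; other formats such as .docx share it, so a real check also has to look inside the archive.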

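2. Read a worksheet

Next, get a worksheet from the workbook. A specific worksheet can be fetched either by its zero-based index or by its name.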

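import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class ExcelReader {
    public static void main(String[] args) {
        String filePath = "path/to/excel/file.xlsx";
        try (FileInputStream fis = new FileInputStream(new File(filePath));
             Workbook workbook = WorkbookFactory.create(fis)) {
            Sheet sheet = workbook.getSheetAt(0); // get the first worksheet
            // Or get a worksheet by name (returns null if it does not exist)
            // Sheet sheet = workbook.getSheet("Sheet1");
            System.out.println("Reading sheet: " + sheet.getSheetName());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}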

3. Read rows and cells

Finally, read the rows and cells of the worksheet. The Row and Cell objects give access to the actual data.

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class ExcelReader {
    public static void main(String[] args) {
        String filePath = "path/to/excel/file.xlsx";
        try (FileInputStream fis = new FileInputStream(new File(filePath));
             Workbook workbook = WorkbookFactory.create(fis)) {
            Sheet sheet = workbook.getSheetAt(0);
            for (Row row : sheet) {
                for (Cell cell : row) {
                    // Pick the getter that matches the cell type; values are tab-separated
                    switch (cell.getCellType()) {
                        case STRING:
                            System.out.print(cell.getStringCellValue() + "\t");
                            break;
                        case NUMERIC:
                            System.out.print(cell.getNumericCellValue() + "\t");
                            break;
                        case BOOLEAN:
                            System.out.print(cell.getBooleanCellValue() + "\t");
                            break;
                        default:
                            // Formula, blank, and error cells end up here
                            System.out.print("UNKNOWN\t");
                            break;
                    }
                }
                System.out.println();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
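
One caveat with for-each loops over a row: the iterator only visits cells that physically exist in the file, so blank cells are skipped and columns can appear shifted. A possible way around this, sketched below under the assumption that you want one value per column index, is to loop by index and pass a `MissingCellPolicy`. The names `MissingCellDemo` and `readRow` and the in-memory sample row are made up for illustration.

```java
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Row.MissingCellPolicy;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class MissingCellDemo {
    // Returns one string per column index, using "" for missing or blank cells.
    static List<String> readRow(Row row) {
        DataFormatter formatter = new DataFormatter();
        List<String> values = new ArrayList<>();
        // getLastCellNum() is the index of the last cell PLUS ONE (-1 for an empty row).
        for (int col = 0; col < row.getLastCellNum(); col++) {
            Cell cell = row.getCell(col, MissingCellPolicy.RETURN_BLANK_AS_NULL);
            values.add(cell == null ? "" : formatter.formatCellValue(cell));
        }
        return values;
    }

    // Hypothetical sample: a row with cells in columns 0 and 2 but none in column 1.
    static List<String> demo() {
        try (Workbook workbook = new XSSFWorkbook()) {
            Row row = workbook.createSheet("Demo").createRow(0);
            row.createCell(0).setCellValue("a");
            row.createCell(2).setCellValue("c");
            return readRow(row);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

`DataFormatter` is used here so that every cell type comes back as the text Excel would display, which keeps the helper independent of the type switch.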

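V. Handling Large Files

Memory consumption needs attention when processing large Excel files, because the regular Workbook API loads the whole file into memory. Apache POI's SXSSFWorkbook class keeps memory low by spilling rows to temporary files on disk, but it is a streaming API for writing and cannot be used to read the rows of an existing file. For reading large .xlsx files, POI instead offers an event API (XSSFReader together with XSSFSheetXMLHandler), which parses the sheet XML with SAX and hands over one cell at a time.

import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.openxml4j.opc.PackageAccess;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.xssf.eventusermodel.ReadOnlySharedStringsTable;
import org.apache.poi.xssf.eventusermodel.XSSFReader;
import org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler;
import org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler.SheetContentsHandler;
import org.apache.poi.xssf.model.StylesTable;
import org.apache.poi.xssf.usermodel.XSSFComment;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import javax.xml.parsers.SAXParserFactory;
import java.io.InputStream;
import java.util.Iterator;

public class ExcelReader {
    public static void main(String[] args) throws Exception {
        String filePath = "path/to/large/excel/file.xlsx";
        try (OPCPackage pkg = OPCPackage.open(filePath, PackageAccess.READ)) {
            XSSFReader reader = new XSSFReader(pkg);
            ReadOnlySharedStringsTable strings = new ReadOnlySharedStringsTable(pkg);
            StylesTable styles = reader.getStylesTable();
            // Callback invoked for each row and cell as the XML is streamed
            SheetContentsHandler printer = new SheetContentsHandler() {
                @Override public void startRow(int rowNum) { }
                @Override public void endRow(int rowNum) { System.out.println(); }
                @Override public void cell(String ref, String value, XSSFComment comment) {
                    System.out.print(value + "\t");
                }
            };
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader parser = factory.newSAXParser().getXMLReader();
            parser.setContentHandler(
                    new XSSFSheetXMLHandler(styles, strings, printer, new DataFormatter(), false));
            // Process only the first sheet, as in the earlier examples
            Iterator<InputStream> sheets = reader.getSheetsData();
            if (sheets.hasNext()) {
                try (InputStream sheet = sheets.next()) {
                    parser.parse(new InputSource(sheet));
                }
            }
        }
    }
}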
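
To illustrate why streaming keeps memory use flat when reading, the JDK-only sketch below reads a worksheet's XML with StAX and counts `<row>` elements without building an object per cell. It relies on the fact that a .xlsx file is a ZIP archive whose sheets are XML parts. The entry name `xl/worksheets/sheet1.xml` is the conventional location of the first sheet but is an assumption here, and the tiny archive built in `sampleArchive` stands in for a real file. It is not a replacement for POI, since it ignores shared strings, styles, and dates.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StreamingRowCounter {
    // Assumed location of the first sheet inside the archive.
    static final String SHEET_ENTRY = "xl/worksheets/sheet1.xml";

    // Streams the sheet XML and counts <row> start tags; nothing is kept in memory.
    static int countRows(InputStream xlsx) throws IOException, XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        try (ZipInputStream zip = new ZipInputStream(xlsx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!SHEET_ENTRY.equals(entry.getName())) {
                    continue;
                }
                XMLStreamReader xml = factory.createXMLStreamReader(zip);
                int rows = 0;
                while (xml.hasNext()) {
                    if (xml.next() == XMLStreamConstants.START_ELEMENT
                            && "row".equals(xml.getLocalName())) {
                        rows++;
                    }
                }
                return rows;
            }
        }
        return 0; // sheet entry not found
    }

    // Builds a hypothetical three-row archive so the sketch runs without a file.
    static byte[] sampleArchive() throws IOException {
        String sheetXml =
                "<worksheet xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">"
              + "<sheetData>"
              + "<row r=\"1\"><c r=\"A1\"><v>1</v></c></row>"
              + "<row r=\"2\"><c r=\"A2\"><v>2</v></c></row>"
              + "<row r=\"3\"/>"
              + "</sheetData></worksheet>";
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry(SHEET_ENTRY));
            zip.write(sheetXml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    static int demo() {
        try {
            return countRows(new ByteArrayInputStream(sampleArchive()));
        } catch (IOException | XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("rows=" + demo());
    }
}
```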

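VI. Handling Cell Data Types

Cells can hold different data types, and each type must be read with the matching getter. Apache POI provides getStringCellValue(), getNumericCellValue(), getBooleanCellValue(), and similar methods for this purpose. Use the CellType enum returned by getCellType() to determine the type of a cell before calling a getter, since calling one that does not match the cell type throws an exception.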

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class ExcelReader {
    public static void main(String[] args) {
        String filePath = "path/to/excel/file.xlsx";
        try (FileInputStream fis = new FileInputStream(new File(filePath));
             Workbook workbook = WorkbookFactory.create(fis)) {
            Sheet sheet = workbook.getSheetAt(0);
            for (Row row : sheet) {
                for (Cell cell : row) {
                    // Compare against the CellType enum to choose the right getter
                    if (cell.getCellType() == CellType.STRING) {
                        System.out.print(cell.getStringCellValue() + "\t");
                    } else if (cell.getCellType() == CellType.NUMERIC) {
                        System.out.print(cell.getNumericCellValue() + "\t");
                    } else if (cell.getCellType() == CellType.BOOLEAN) {
                        System.out.print(cell.getBooleanCellValue() + "\t");
                    } else {
                        System.out.print("UNKNOWN\t");
                    }
                }
                System.out.println();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
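
Dates deserve a special mention: Excel stores them as numbers, so a date cell is reported as NUMERIC and `getNumericCellValue()` returns a serial day count. POI can detect and convert such cells, for example with `DateUtil.isCellDateFormatted(cell)` and `cell.getDateCellValue()`. The JDK-only sketch below merely illustrates the underlying arithmetic for the default 1900 date system. `ExcelSerialDate` and `toLocalDate` are hypothetical names, and serials below 61 are rejected because Excel's fictitious 29 February 1900 makes them ambiguous.

```java
import java.time.LocalDate;

class ExcelSerialDate {
    // With this epoch, serial 61 maps to 1900-03-01, the first day after
    // the non-existent 1900-02-29 that Excel counts as serial 60.
    private static final LocalDate EPOCH = LocalDate.of(1899, 12, 30);

    // Converts a 1900-system serial number to a date; the fraction (time of day) is dropped.
    static LocalDate toLocalDate(double serial) {
        if (serial < 61) {
            throw new IllegalArgumentException("serials below 61 are not handled by this sketch");
        }
        return EPOCH.plusDays((long) Math.floor(serial));
    }

    public static void main(String[] args) {
        System.out.println(toLocalDate(44197.0)); // prints 2021-01-01
    }
}
```

In real code, prefer POI's own conversion, which also handles the 1904 date system that some workbooks use.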

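VII. Handling Merged Cells

Excel files sometimes contain merged cells. In Apache POI, the merged regions of a worksheet are exposed on the Sheet itself: getNumMergedRegions() returns how many there are, and getMergedRegion(int) returns each one as a CellRangeAddress. (The RegionUtil class is for styling the borders of a region, not for looking regions up.) The example below defines two small helpers on top of these methods to check whether a cell lies inside a merged region and to fetch that region.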

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.apache.poi.ss.util.CellRangeAddress;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class ExcelReader {
    public static void main(String[] args) {
        String filePath = "path/to/excel/file.xlsx";
        try (FileInputStream fis = new FileInputStream(new File(filePath));
             Workbook workbook = WorkbookFactory.create(fis)) {
            Sheet sheet = workbook.getSheetAt(0);
            for (Row row : sheet) {
                for (Cell cell : row) {
                    if (isMergedRegion(sheet, cell.getRowIndex(), cell.getColumnIndex())) {
                        CellRangeAddress region = getMergedRegion(sheet, cell.getRowIndex(), cell.getColumnIndex());
                        System.out.print("Merged Region: " + region.formatAsString() + "\t");
                    } else {
                        System.out.print(cell.toString() + "\t");
                    }
                }
                System.out.println();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Returns true if the cell at (row, col) lies inside any merged region
    private static boolean isMergedRegion(Sheet sheet, int row, int col) {
        return getMergedRegion(sheet, row, col) != null;
    }

    // Returns the merged region containing (row, col), or null if there is none
    private static CellRangeAddress getMergedRegion(Sheet sheet, int row, int col) {
        for (int i = 0; i < sheet.getNumMergedRegions(); i++) {
            CellRangeAddress region = sheet.getMergedRegion(i);
            if (region.isInRange(row, col)) {
                return region;
            }
        }
        return null;
    }
}
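
When Excel merges cells it keeps only the value of the top-left cell, so the other cells of a region read back as blank or are absent. A common follow-up, sketched here as one possible approach, is to resolve any cell inside a region to that top-left cell. `MergedValueDemo`, `effectiveValue`, and the in-memory sheet are made up for illustration.

```java
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.util.CellRangeAddress;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import java.io.IOException;
import java.io.UncheckedIOException;

class MergedValueDemo {
    // Returns the displayed text for (rowIdx, colIdx), following merged regions
    // back to their top-left cell; returns "" when no cell exists there.
    static String effectiveValue(Sheet sheet, int rowIdx, int colIdx) {
        int r = rowIdx;
        int c = colIdx;
        for (CellRangeAddress region : sheet.getMergedRegions()) {
            if (region.isInRange(rowIdx, colIdx)) {
                r = region.getFirstRow();
                c = region.getFirstColumn();
                break;
            }
        }
        Row row = sheet.getRow(r);
        Cell cell = (row == null) ? null : row.getCell(c);
        return (cell == null) ? "" : new DataFormatter().formatCellValue(cell);
    }

    // Hypothetical sample: A1:C2 is merged and only A1 carries a value.
    static String demo(int rowIdx, int colIdx) {
        try (Workbook workbook = new XSSFWorkbook()) {
            Sheet sheet = workbook.createSheet("Demo");
            sheet.createRow(0).createCell(0).setCellValue("Title");
            // Arguments: firstRow, lastRow, firstCol, lastCol
            sheet.addMergedRegion(new CellRangeAddress(0, 1, 0, 2));
            return effectiveValue(sheet, rowIdx, colIdx);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(1, 2));
    }
}
```

Scanning every merged region per lookup is fine for small sheets; for sheets with many regions, building an index from cell position to region once would be a reasonable refinement.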

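VIII. Summary

Reading Excel files is a common requirement in Java, and Apache POI provides powerful support for the various Excel file formats and cell data types. In practice, the key points are choosing a suitable library, understanding the structure of an Excel file, and handling large files and different cell data types correctly. The explanations and code examples in this article cover the basics of reading Excel files with Apache POI. For more advanced needs, refer to the official Apache POI documentation and sample code.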