@yunkaiOr2
Created August 11, 2021 14:31
read tar.gz file
import org.apache.commons.compress.archivers.tar.TarArchiveEntry
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream
import org.apache.commons.io.FileUtils
import java.nio.file.Files
import java.util.zip.GZIPInputStream
// Placeholder URL from the original; point it at the real archive
url = new URL("http://xxx.tar.gz")
// Download to a temp file that is removed when the JVM exits
tempFile = Files.createTempFile(null, null)
tempFile.toFile().deleteOnExit()
FileUtils.copyURLToFile(url, tempFile.toFile())
byte[] buffer = new byte[1024]
try {
    ByteArrayOutputStream bao = new ByteArrayOutputStream()
    // withCloseable closes the tar stream (and the gzip/file streams it wraps) when the closure exits
    new TarArchiveInputStream(new GZIPInputStream(new FileInputStream(tempFile.toFile()))).withCloseable { taris ->
        TarArchiveEntry entry
        while ((entry = taris.getNextTarEntry()) != null) {
            // Skip directories and .crc checksum files
            if (entry.isDirectory() || entry.getName().endsWith(".crc")) {
                continue
            }
            // Skip empty files
            if (entry.getSize() == 0) {
                continue
            }
            // Append the current entry's bytes to the shared buffer
            int len
            while ((len = taris.read(buffer)) > 0) {
                bao.write(buffer, 0, len)
            }
        }
    }
    // Count lines across the concatenated contents of all kept entries
    long lineNumber = new LineNumberReader(new InputStreamReader(new ByteArrayInputStream(bao.toByteArray()))).lines().count()
    println lineNumber
} catch (Exception e) {
    throw new RuntimeException("gzip read fail", e)
}