Created
September 29, 2011 15:33
discoposse/1251006
PowerShell - split large text/log file
#############################################
# Split a log/text file into smaller chunks #
#############################################
#
# WARNING: This takes a long time on extremely large files and uses a lot of memory to stage the file
#
# Set the baseline counters
#
# Set the line counter to 0
$linecount = 0
# Set the file counter to 1. This is used for the naming of the log files
$filenumber = 1
# Prompt user for the path
$sourcefilename = Read-Host "What is the full path and name of the log file to split? (e.g. D:\mylogfiles\mylog.txt)"
# Prompt user for the destination folder to create the chunk files
$destinationfolderpath = Read-Host "What is the path where you want to extract the content? (e.g. d:\yourpath\)"
Write-Host "Please wait while the line count is calculated. This may take a while. No really, it could take a long time."
# Find the current line count to present to the user before asking the new line count for chunk files
$sourcelinecount = (Get-Content $sourcefilename | Measure-Object).Count
# Tell the user how large the current file is
Write-Host "Your file is currently $sourcelinecount lines long"
# Prompt user for the size of the new chunk files
$destinationfilesize = Read-Host "How many lines will be in each new split file?"
# Read-Host returns a string, so convert it to an integer
# Set the upper boundary (maximum line count to write to each file)
$maxsize = [int]$destinationfilesize
Write-Host "File is $sourcefilename - destination is $destinationfolderpath - new file line count will be $maxsize"
# The process reads each line of the source file, appends it to the current chunk file and increments the line counter.
# When the counter reaches $maxsize, it moves on to the next chunk file and resets the counter.
Get-Content $sourcefilename | ForEach-Object {
    # Join-Path copes with a destination path given with or without a trailing backslash
    Add-Content -Path (Join-Path $destinationfolderpath "splitlog$filenumber.txt") -Value $_
    $linecount++
    if ($linecount -eq $maxsize) {
        $filenumber++
        $linecount = 0
    }
}
# Clean up after your pet
[GC]::Collect()
[GC]::WaitForPendingFinalizers()
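The warning at the top of the script mentions long run times on very large files. One likely contributor is that `Add-Content` opens and closes the target file for every line. The following is a minimal sketch of a possible alternative, not the author's method: it streams the source with .NET's `System.IO.StreamReader` and keeps one `System.IO.StreamWriter` open per chunk. The function name `Split-TextFile` and its parameters are hypothetical, introduced here for illustration. Two behaviors differ from the script above and are assumptions of this sketch: existing `splitlogN.txt` files are overwritten rather than appended to, and output is written as UTF-8 without a byte order mark.

```powershell
# Hypothetical helper (not part of the original gist): split a text file into
# chunk files of at most $MaxLines lines, keeping one writer open per chunk.
function Split-TextFile {
    param(
        [Parameter(Mandatory = $true)][string]$SourcePath,
        [Parameter(Mandatory = $true)][string]$DestinationFolder,
        [Parameter(Mandatory = $true)][ValidateRange(1, 2147483647)][int]$MaxLines
    )

    # .NET resolves relative paths against the process working directory, not
    # the current PowerShell location, so resolve both paths up front.
    $source = (Resolve-Path -LiteralPath $SourcePath).ProviderPath
    $folder = (Resolve-Path -LiteralPath $DestinationFolder).ProviderPath

    $fileNumber = 1
    $lineCount = 0
    $writer = $null
    $reader = New-Object System.IO.StreamReader($source)
    try {
        # ReadLine returns $null at end of file; empty lines come back as ''
        while ($null -ne ($line = $reader.ReadLine())) {
            if ($null -eq $writer) {
                # Open the next chunk lazily so no empty trailing file is created.
                # Assumption: an existing chunk file is overwritten, not appended to.
                $target = Join-Path $folder "splitlog$fileNumber.txt"
                $writer = New-Object System.IO.StreamWriter($target)
            }
            $writer.WriteLine($line)
            $lineCount++
            if ($lineCount -eq $MaxLines) {
                # Chunk is full: close it and move on to the next file number
                $writer.Dispose()
                $writer = $null
                $fileNumber++
                $lineCount = 0
            }
        }
    }
    finally {
        if ($null -ne $writer) { $writer.Dispose() }
        $reader.Dispose()
    }
}
```

Assuming the example paths from the prompts above, a call might look like `Split-TextFile -SourcePath 'D:\mylogfiles\mylog.txt' -DestinationFolder 'D:\yourpath' -MaxLines 100000`. The destination folder must already exist, since `Resolve-Path` fails on a missing path.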
This is amazing. I tweaked it a bit and added a few lines to make it static.
I also added logic to split the file into chunks of more than 120 lines that start with 0 and end with 7999-99 (my text file has common text).
I modified the output file name as well, to include a date format.
Thank you for the great script! It made my day.