@slimlime
Last active March 27, 2021 08:03
Generate non-overlapping (gapped) sequences: partitioned sequences of auto-incrementing numbers.
generate-non-overlapping-sequences.js
/**
 * Quick, undocumented utility fn. Not sure what the searchable (SEO) name for this concept is.
 * Reminds me of a hacky workaround for distributed database reconciliation, where separate nodes each increment their own distinct IDs.
 * Some parameters could probably move to different places.
 * Functional! FP
 * - TODO: Rename to "get____" function name convention. ref transp
 */
function generateSteppedSequence(
  stepNumberOfSequences,
  originalBaseLength,
  baseOffsetStartingNumber,
  sequenceIdentifier
) {
  // sequenceIdentifier is 1-based: valid values are 1, 2, 3 when stepNumberOfSequences is 3
  const sequenceOffsetForNoOverlaps = sequenceIdentifier - 1;
  // Rounded up, so every sequence gets the same length even when the division is uneven
  const individualPartitionedSteppedLength = Math.ceil(originalBaseLength / stepNumberOfSequences);
  const separatedNumbers = Array.from(
    {
      length: individualPartitionedSteppedLength,
    },
    (empty, index) => {
      // Jump by the number of sequences, shifted by this sequence's offset and the base start
      const calcJump =
        index * stepNumberOfSequences +
        sequenceOffsetForNoOverlaps +
        baseOffsetStartingNumber;
      return calcJump;
    }
  );
  return separatedNumbers;
}
function generateNonOverlappingSequences(
  numSequences,
  totalBaseLength,
  baseOffset
) {
  const parentArrayOfSequences = Array.from(
    { length: numSequences },
    (empty, index) =>
      generateSteppedSequence(
        numSequences,
        totalBaseLength,
        baseOffset,
        // sequenceIdentifier is 1-based; passing the raw 0-based index shifts every sequence down by one
        index + 1
      )
  );
  return parentArrayOfSequences;
}
/**
 * Builds the word list: one newline-separated string per sequence.
 * @param {Array<number[]>} listOfSequences the sequences we generated
 * @returns {string[]} each sequence joined with "\n"
 * naming things is hard
 */
function getListOfTextNewLinedStringifiedSequencesFromListOfSequences(listOfSequences) {
  const listOfInterspersedNewLinedSequences = listOfSequences.map(sequence => sequence.join("\n"));
  return listOfInterspersedNewLinedSequences;
}
/**
 * Quick fn: prefixes each number with strPrefix and returns the URLs as one newline-separated string.
 */
function directMapListOfUrlsFromTokenIDs(sequence, strPrefix = "https://.../?&blah=") {
  return sequence
    .map((num) => `${strPrefix}${num}`)
    // Joined with new lines into a single string
    .join("\n");
}
@slimlime
Author

https://gist.github.com/slimlime/58a3b9ffbb09337f58a255453a8f19f1#file-generate-non-overlapping-sequences-js-L21

Should the length of each individual sequence be the same, or only similar? It can be off by one depending on whether the division rounds up or truncates when the split is uneven. Any edge cases? Math maffs, laffs all round: I assumed it all rounds down and is okay.. whoops
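To make the length question concrete, here is a minimal sketch. `Math.ceil(10 / 3)` is 4, so three sequences of that length produce 12 numbers for a base length of 10. The helper `steppedLength` is hypothetical (it is not in the gist) and assumes a 0-based sequence index `k`; it gives each sequence only as many entries as fit inside the range.

```javascript
// Hypothetical helper, not part of the gist: length of the k-th (0-based) sequence
// when only `total` consecutive numbers should be handed out in all.
function steppedLength(total, numSequences, k) {
  // Sequence k holds the positions k, k + n, k + 2n, ... that are below total
  return Math.max(0, Math.ceil((total - k) / numSequences));
}

const uniformTotal = Math.ceil(10 / 3) * 3;
const lengths = [0, 1, 2].map((k) => steppedLength(10, 3, k));

console.log(uniformTotal); // 12, two more than the base length of 10
console.log(lengths); // [4, 3, 3], which sums to 10
```

With this variant the sequences differ in length by at most one, and nothing past the end of the range is generated.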

@slimlime
Author

could micro-optimise by hoisting the division out of the nested fn, since it gives the same result for every sequence
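A rough sketch of what hoisting the division could look like. The function name `generateNonOverlappingSequencesHoisted` is made up for illustration, and it assumes the 0-based `sequenceIndex` is used directly as the offset (equivalent to a 1-based identifier minus 1).

```javascript
// Sketch only: same arithmetic as the gist, with the length computed once
// in the outer function instead of once per sequence.
function generateNonOverlappingSequencesHoisted(numSequences, totalBaseLength, baseOffset) {
  const lengthPerSequence = Math.ceil(totalBaseLength / numSequences);
  return Array.from({ length: numSequences }, (_, sequenceIndex) =>
    Array.from(
      { length: lengthPerSequence },
      (_, index) => index * numSequences + sequenceIndex + baseOffset
    )
  );
}

const hoisted = generateNonOverlappingSequencesHoisted(3, 10, 1);
console.log(hoisted);
// [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
```

The saving is one division per sequence, so it only matters as tidiness unless the number of sequences is very large.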

@slimlime
Author

slimlime commented Mar 27, 2021

generateNonOverlappingSequences(3, 10, 1);

(3) [Array(4), Array(4), Array(4)]
0: (4) [0, 3, 6, 9]
1: (4) [1, 4, 7, 10]
2: (4) [2, 5, 8, 11]
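A quick way to sanity-check the output above, as a sketch: flatten the three arrays, sort them, and confirm there are no repeats and no gaps. The arrays are copied verbatim from the console output; the variable names are only for this example.

```javascript
// The three arrays from the output above, copied verbatim.
const sequences = [
  [0, 3, 6, 9],
  [1, 4, 7, 10],
  [2, 5, 8, 11],
];

const all = sequences.flat().sort((a, b) => a - b);
// No value appears in more than one sequence
const noOverlaps = new Set(all).size === all.length;
// Taken together the values form one unbroken run
const contiguous = all.every((n, i) => i === 0 || n === all[i - 1] + 1);

console.log(noOverlaps, contiguous, all.length); // true true 12
```

The count is 12 rather than 10 because every sequence gets `Math.ceil(10 / 3)` = 4 entries.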

@slimlime
Author

numSequences = number of splits

@slimlime
Author

alternative techniques
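The comment does not say which alternatives were in mind, so the following is only one possibility, offered as an assumption: deal the numbers out round-robin with the modulo operator. `dealRoundRobin` is a hypothetical name.

```javascript
// One possible alternative (an assumption, the comment names none):
// walk the range once and drop each number into bucket i % numSequences.
function dealRoundRobin(numSequences, totalBaseLength, baseOffset) {
  const buckets = Array.from({ length: numSequences }, () => []);
  for (let i = 0; i < totalBaseLength; i++) {
    buckets[i % numSequences].push(baseOffset + i);
  }
  return buckets;
}

const dealt = dealRoundRobin(3, 10, 1);
console.log(dealt);
// [[1, 4, 7, 10], [2, 5, 8], [3, 6, 9]]
```

This version never generates more than `totalBaseLength` numbers, at the cost of buckets that can differ in length by one.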

@slimlime
Author

oop maybe off by one
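A small sketch of where an off-by-one can creep in. The posted output starts at 0 although baseOffset is 1, which is what happens when a 0-based identifier is fed to a helper that subtracts 1. `steppedValue` below is a hypothetical reduction of the arithmetic in generateSteppedSequence to a single value.

```javascript
// Same arithmetic as generateSteppedSequence in the gist, for one element.
function steppedValue(index, numSequences, baseOffset, sequenceIdentifier) {
  return index * numSequences + (sequenceIdentifier - 1) + baseOffset;
}

// First element (index 0) of each of 3 sequences, with baseOffset 1
const zeroBased = [0, 1, 2].map((id) => steppedValue(0, 3, 1, id));
const oneBased = [1, 2, 3].map((id) => steppedValue(0, 3, 1, id));

console.log(zeroBased); // [0, 1, 2], matches the posted output
console.log(oneBased); // [1, 2, 3], starts at baseOffset
```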

@slimlime
Author

did not expect notepad++ to freeze on simple regex replace for less than a million lines :D

@slimlime
Author

oof 16 million characters for one of the splits? whoops!
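If the size of the joined string is the problem, one option (a sketch, not something the gist does) is a generator that yields one line at a time, so the lines can be written out as they are produced. `iterateUrlLines` and the `https://example.invalid/?id=` prefix are placeholders invented for this example.

```javascript
// Hypothetical lazy variant: yields one line at a time instead of
// joining the whole sequence into a single multi-million-character string.
function* iterateUrlLines(start, step, count, strPrefix) {
  for (let index = 0; index < count; index++) {
    yield `${strPrefix}${start + index * step}`;
  }
}

// Collected into an array here only to keep the demo small;
// in real use each line would go straight to a file or stream.
const lines = [...iterateUrlLines(1, 3, 4, "https://example.invalid/?id=")];
console.log(lines.length); // 4
console.log(lines[3]); // https://example.invalid/?id=10
```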

@slimlime
Author

nothing to see here
not generating wordlists 👀
