import java.util.Random;
public class HelloWorld {
    public static void main(String... args) {
        System.out.println(randomString(-229985452) + ' ' + randomString(-147909649));
    }
    public static String randomString(int seed) {
        Random rand = new Random(seed);
        StringBuilder sb = new StringBuilder();
{"msgId":"49bbf0ed-0121-467a-ba6a-72a1c3253d6b","sidPrefix":0,"stateId":"181fb2f8-6ff9-43a0-afc8-a0f7bfd3cb0f","version":418,"modified":1648946600843,"accepted":0,"region":101,"modifier":0,"message":{"accountId":"7819b386-8372-0004-d7c8-22966c77d7af","from":"DOORDASH","fromUpdates":[{"timestamp":1648852202215,"from":"DOORDASH"}],"to":"+6147521XXXX","toUpdates":[{"timestamp":1648852200668,"to":"+6147521XXXX"}],"text":"","validityPeriod":14400,"numberOfSegments":9,"twilioApiVersion":2,"flags":{"contentDiscard":0,"addressObfuscate":0},"useCaseTag":"undeclared","direction":0,"mediaMetadata":[]},"account":{"accountStatus":"ACTIVE","currency":"USD"},"billing":{"billingEvents":[{"transactionId":"SM49bbf0ed0121467aba6a72a1c3253d6b","billableItem":{"id":"2fe62aa3-14b6-cb26-611a-8eaad13ff5d7","suffix":""},"price":746.210048,"quantity":9}]},"sms":{"attempts":[{"segments":[{"content":{"text":"","udh":"\u0005\u0000\u0003\u0001\t\u0001"}},{"content":{"text":"","udh":"\u0005\u0000\u0003\u0001\t\u0002"}},{"content":{"text":"
@envomp
envomp / multivariate-statistics-project-data.ipynb
Created December 16, 2022 22:21
Multivariate statistics project data.ipynb
@envomp
envomp / multivariate-statistics-project.ipynb
Created December 17, 2022 10:39
Multivariate statistics project.ipynb
@envomp
envomp / multivariate-statistics-project.ipynb
Created December 28, 2022 06:51
Multivariate statistics project.ipynb
23/10/09 07:01:22 WARN TaskSetManager: Lost task 1948.0 in stage 10.0 (TID 62283) ([2600:1f18:499a:3100:bdf2:5937:d22a:1e78] executor 232): java.lang.RuntimeException: org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve files in partition s3a://com.twilio.messaging.mdp.datalake.tables.mdr-finalized.pii/data/2023/10/7/1 from metadata
at org.apache.hudi.client.utils.LazyIterableIterator.next(LazyIterableIterator.java:121)
at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:46)
at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:513)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)
at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:183)
at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(S
Pretraining a question answering model
- The selected model size of 70 million parameters is not nearly enough for the model to fully comprehend context.
- It was enough for the model to start talking on the correct topic and form coherent answers.
- 3 to 4 epochs over a smaller dataset is more than enough. Anything beyond that leads to overtraining and regression.
- The dataset shouldn't allow for "easy win" answers, such as a specific string being the correct answer 50% of the time.
- Having an inappropriately large vocabulary compared to the model size probably also had a negative effect.
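One way to spot "easy win" answers before training is to measure how much of the dataset a single answer string covers. The following is a minimal sketch, not the check used in this project; the function name `majority_answer_share` and the toy answers are hypothetical.

```python
from collections import Counter


def majority_answer_share(answers):
    """Return (most common answer, fraction of examples it covers)."""
    if not answers:
        raise ValueError("answers must not be empty")
    # Normalize lightly so "yes" and "yes " count as the same answer.
    normalized = [a.strip().lower() for a in answers]
    answer, count = Counter(normalized).most_common(1)[0]
    return answer, count / len(normalized)


# Hypothetical toy answers, not taken from the actual dataset.
toy_answers = ["yes", "no", "yes", "Paris", "yes ", "42"]
answer, share = majority_answer_share(toy_answers)
print(f"{answer!r} covers {share:.0%} of the examples")
```

A share close to 50% means a model can score well by always emitting that one string, which is the situation the note above warns about.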
MixtralForCausalLM(
######################################### original :
Solve the following match question by detailing every reasoning step.
Question:
Let's imagine a population of 100 humans. At the start of every epoch, every human gives birth to a child. We have to murder X of the children before they grow up to humans by the end of the epoch. If we want to have exactly 1000 humans after 10 epochs, then what is the value of X?
Answer:
At the start of each epoch, the population increases by 100, so after 10 epochs, the population will be 100 * 10 = 1000.
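For reference, the question can be checked numerically under one reading of it: each epoch the population doubles and then X children are removed, so P_{k+1} = 2 * P_k - X. This reading is an assumption, and the function names below are hypothetical. The sketch solves for X with exact fractions and verifies the result by simulation.

```python
from fractions import Fraction


def population_after(start, removed_per_epoch, epochs):
    """Simulate P_{k+1} = 2 * P_k - X for the given number of epochs."""
    population = Fraction(start)
    for _ in range(epochs):
        population = 2 * population - removed_per_epoch
    return population


def solve_removed_per_epoch(start, target, epochs):
    """Solve for X using the closed form P_n = 2^n * P_0 - X * (2^n - 1)."""
    growth = 2 ** epochs
    return Fraction(growth * start - target, growth - 1)


x = solve_removed_per_epoch(100, 1000, 10)
print(x, float(x))
# The simulated population matches the target exactly.
assert population_after(100, x, 10) == 1000
```

Under this assumed recurrence X comes out to 33800/341, roughly 99.12, so no whole number of children satisfies the question exactly.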
[2, 3, 128000, 128002, 128009] [-100, -100, -100, 128002, 128009]
[0, 7, 128000, 128002, 128009] [-100, -100, -100, 128002, 128009]
[1, 7, 128000, 128002, 128009] [-100, -100, -100, 128002, 128009]
[2, 1, 128000, 128003, 128009] [-100, -100, -100, 128003, 128009]
[7, 6, 128000, 128002, 128009] [-100, -100, -100, 128002, 128009]
[0, 3, 128000, 128003, 128009] [-100, -100, -100, 128003, 128009]
[4, 4, 128000, 128002, 128009] [-100, -100, -100, 128002, 128009]
[4, 2, 128000, 128002, 128009] [-100, -100, -100, 128002, 128009]
[3, 7, 128000, 128003, 128009] [-100, -100, -100, 128003, 128009]
[2, 5, 128000, 128002, 128009] [-100, -100, -100, 128002, 128009]
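The rows above look like (input ids, labels) pairs in which the first three positions of the labels are replaced by -100, the default `ignore_index` of PyTorch's `CrossEntropyLoss`, so that only the last two tokens contribute to the loss. Assuming that interpretation, a minimal sketch of the masking step could look like this; `mask_prompt_labels` and `prompt_len` are hypothetical names.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_prompt_labels(input_ids, prompt_len):
    """Copy input_ids into labels, hiding the first prompt_len tokens from the loss."""
    if not 0 <= prompt_len <= len(input_ids):
        raise ValueError("prompt_len must be between 0 and len(input_ids)")
    return [IGNORE_INDEX] * prompt_len + list(input_ids[prompt_len:])


# First row from the listing above; a prompt length of 3 is assumed.
row = [2, 3, 128000, 128002, 128009]
print(row, mask_prompt_labels(row, prompt_len=3))
```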
Dataset:
<s> I exist </s>
<s> Not that I want to </s>
<s> I want food </s>
<s> It is not what I want </s>