
Lemontea lemonteaa

@lemonteaa
lemonteaa / olmo.md
Last active May 7, 2024 18:46
Turning LLM base model into chatbot

Turning LLM base model into chatbot

OLMo is a "true open source" model oriented toward scientific research: many intermediate training checkpoints are available, and the training dataset is open as well.

The attached script (olmo.py) demonstrates steering a base model into behaving like a chatbot using prompt engineering alone. Basic chatting already works without the few-shot examples, but without them the answers may not sound as "natural". However, I find that guiding chatbot behavior through examples is quite unstable and brittle, and the examples seem to occupy the model's limited cognitive bandwidth, which reduces the quality of the actual answer content.

Feel free to experiment more with the prompt.
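
As a rough illustration of the idea (not the contents of olmo.py, which are not shown here), a few-shot chat prompt for a base model could be assembled as below. The role labels `User`/`Assistant`, the function names `build_prompt` and `extract_reply`, and the prompt layout are all assumptions made for this sketch; the call to the model itself is left out.

```python
from typing import List, Tuple

# Hypothetical role labels; the actual script may use different ones.
USER, BOT = "User", "Assistant"


def build_prompt(
    preamble: str,
    shots: List[Tuple[str, str]],
    history: List[Tuple[str, str]],
    question: str,
) -> str:
    """Lay out a transcript that a base model can simply continue."""
    lines = [preamble, ""]
    # Few-shot examples come first, followed by the real conversation so far.
    for user_turn, bot_turn in shots + history:
        lines.append(f"{USER}: {user_turn}")
        lines.append(f"{BOT}: {bot_turn}")
    lines.append(f"{USER}: {question}")
    # End on an open assistant turn so the completion is the reply.
    lines.append(f"{BOT}:")
    return "\n".join(lines)


def extract_reply(completion: str) -> str:
    """A base model tends to keep writing the transcript; cut at the next user turn."""
    return completion.split(f"\n{USER}:", 1)[0].strip()
```

Passing an empty `shots` list gives the variant without few-shot examples, which makes it easy to compare the two behaviours described above.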

Dependencies

Prompt IDE Idea

Goal and Non-goal

The goal is not to become an enterprise-style tool, so probably no fancy collaboration or auto-eval by the LLM itself. The focus is instead on locally hosted open source LLMs and on use cases such as tool use (retrieval-augmented chatbot), document ingestion, and agents.

Single Prompt Editing

Basic Features

  • Automatic output parsing and logging

LLM Application Framework Proposal

Motivation

Improve from langchain?

Because LLMs deserve a more pleasant development experience.

Disclaimer

lemonteaa / challenge-mode.md
Last active September 16, 2022 15:20
What does a good fullstack Clojure(script) look like in 2022?

Advanced Baseline (Clojure(script)) [^1]

Clojure(script) [^2] has matured as a language and is no longer as hyped. At this point in time, there ought to be consolidation towards an easy path (yes, "Simple, not easy" [^3], I know, but try telling that to someone in a rush (me)). This note sketches one to the best of my personal knowledge.

Disclaimer: I haven't followed the ecosystem for the last 2-3 years, so this note may lag behind by one full generation of tech. That being said, Clojure tends to be more stable, slow, and methodical in its design process, so libraries tend to have a longer shelf life.

Context/Assumption

  • You're in a greenfield project, with progressive management, and so are free to aggressively adopt best-in-class architecture/design/practices.
  • You want to have a mostly stable base upon which to build - sacrificing some "advanced features" is acceptable. This affects the selection of libraries.

Revision: Setting up Java dev env with code-server + sdkman

Context

A trend in recent years is the rise of remote development. One important component of it is the use of an isolated container/VM (possibly ephemeral) for the sole purpose of development. Moreover, all the setup steps necessary beforehand are standardised and made repeatable using scripts, IaC, etc. When done right, this drastically reduces the friction of onboarding developers who join a project by making the "Getting Started" process as simple as clicking a button - one is then led to a webpage running vscode, backed by a container/VM running in the cloud, with everything already set up and ready to use.

In this note, we will go through the steps of setting up a standard Java development environment manually. This can be useful in a pinch, for example when you have your hands on a VPS or a plain vscode service that lacks those automations/integrations.

Installing, configuring, and exposing code-server

lemonteaa / kubectl-cheat.md
Last active August 8, 2022 19:43
kubectl cheatsheet

Kubectl cheatsheet

Just some quick n dirty stuff. (May move this note to main repo later)

Sometimes you just want to quickly spin up something in k8s, and writing all those elaborate YAML files is just too much friction. You can do a lot just from kubectl though:

kubectl create deployment <deployment name> --image=yourorg/imagename:ver

Then expose it:

lemonteaa / html5-integration.md
Last active August 8, 2022 01:46
Web Tech Infra: Leveraging html5 api for media rich application

Web Tech Infra: Leveraging html5 api for media rich application

Since Flash has been unsupported in browsers ever since its phase-out was completed years ago, various HTML5 APIs provide pieces of the puzzle for its replacement. Nonetheless, these APIs and new standards are emerging technologies, and some of them are still evolving or immature. Therefore, as of 2022, there still appears to be a gap in terms of a robust ecosystem of well-known libraries, frameworks, and documentation for end-to-end applications.

There are notable exceptions though. For example, HTML5 games do have frameworks that have worked out these details for the most part (and hide them under some abstractions). But not every media-rich application is a game.

Perhaps another concern is that this domain is diverse enough that there is no such thing as a "one-size-fits-all" approach.

Below I sketch one possible combination, or piecing together, of disparate web techs. (Still learning - expect inaccuracies or even naivety.)

lemonteaa / setup-git.sh
Created July 4, 2022 19:47
Setting up git credential
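# Create the SSH directory (no error if it already exists) and restrict it to the owner
mkdir -p ~/.ssh/
chmod 700 ~/.ssh/
# Install the SSH client config and make it readable by the owner only
cp ssh_conf ~/.ssh/config
chmod 600 ~/.ssh/config
# Create the key files if they do not exist yet; the private key must stay owner-only
touch ~/.ssh/id_ed25519.pub
touch ~/.ssh/id_ed25519
chmod 600 ~/.ssh/id_ed25519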
@lemonteaa
lemonteaa / proposal.md
Last active January 2, 2022 07:55
Project Proposal: Client-side and Blockchain integrated Learning Management System (applied to tech bootcamp)

Project Proposal: Client-side and Blockchain integrated Learning Management System (applied to tech bootcamp)

Motivation

I want to have a learning website for a custom made/DIY tech bootcamp. Two major requirements are that:

  • It should have lots of hands-on lab components so that students can work on the projects/exercises without leaving the browser.
  • And yet, we want the whole thing to be a static website if possible, because we do not have the resources to manage a server (both cloud costs and maintenance/scalability concerns).

To clarify, using other people's free cloud resources is allowed, but we also want the website to be sustainable and built for longevity - it should mostly rely on well-funded free resources that have a low risk of shutting down, or at least should be robust against them folding by having alternatives that we can switch to.

lemonteaa / blockchain_linkdump1.md
Last active March 5, 2021 16:27
Dump of blockchain related links I saved, dug out from a deep burrow

Blockchain tech link dump

Context

Someone on lihkg asked whether there is a technically oriented blockchain tg group. I guess there probably is, but still, I remember having spent quite some time trying to navigate this space, and it can be very addictive. It takes some experience to separate the technical meat from mere narcissistic self-aggrandizement. Although it is a maze, in the end I did come out with the knowledge that there are some legitimate insights, and I saved those links in a note somewhere, which quickly got buried. Today I'm digging it out and dumping it here for the record.

(Note: the full list has been slightly abridged and some not-too-relevant links pruned)

(2nd version: finished dumping the original note file) (3rd: dug out 2 more links by searching Google... I should have saved those, not sure why they are not in my notes)