Vertical decomposition. Creating cohesive services

One of the biggest misconceptions about services is that a service is an independent deployable unit, i.e., service equals process. With this view, we are defining services according to how components are physically deployed. In our example, since it’s clear that the backend admin runs in its own process/container, we consider it to be a service.

But this definition of a service is wrong. Rather, you need to define your services in terms of business capabilities. The deployment aspect of the system doesn’t have to be correlated with how the system has been divided into logical services. For example, a single service might run in different components/processes, and a single component might contain parts of multiple services. Once you start thinking of services in terms of business capabilities rather than deployment units, a whole world of options opens up.

What are the Admin UI, Admin Backend and Website UI, Website Backend components? They basically act as containers of services. They are maintained by their own teams and their sole purpose is to coordinate between services. These components are business-logic agnostic.

Avoiding Microservice Megadisasters. Unsure about the approach to search and data duplication

Microservices and Rules Engines – a blast from the past

"search engine, not search service" "allows each microservice to put a component into it, and the search engine will run that set of rules" "what we are talking about here is not the whole microservice, but the search component of that service" "that way, the search engine doesn't need in and of itself access to all of that data directly"

Don't build a distributed monolith

"Don't couple systems with binary dependencies"

Alas, this seems to go against the "thinking of services in terms of business capabilities rather than deployment units" principle. If the deployment is intertwined, it seems that there will be binary dependencies.

The Art of the node.js Rescue

The entity service antipattern

Five pieces of advice for new technical leads

The System Design Primer hn

in general, re-organizing the architecture of a system is usually possible - if and only if - the underlying data model is sane.

What is the convention for addressing assets and entities? Is it consistent and useful for informing both security and data routing?

What is the security policy for any specific entity in your system? How can it be modified? How long does it take to propagate that change? How centralized is the authentication?

If a piece of "data" is found, how complex is it to find the origin of this data?

What is the policy/system for enforcing that subsystems have a very narrow capability to mutate information?

More than concentric layers The Software Architecture Chronicles

Managing the Complexity of Microservices Deployments

Designing Microservice Architectures the Right Way slides

To Test A System, You Need A Good Design. Shunt pattern. Two-level test suites?

For a test environment, you can inject an “In-Memory Data Source.” For production, you can use the “HTTP Server Data Source.”
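
A minimal Java sketch of that idea, assuming a hypothetical CustomerDataSource port; the in-memory implementation would be injected in tests and the HTTP-backed one in production:

```java
// The code under test depends only on this port; the composition root picks the implementation.
interface CustomerDataSource {
    String fetchCustomer(String id);
}

// "In-Memory Data Source" for tests: no network, fully deterministic.
class InMemoryCustomerDataSource implements CustomerDataSource {
    private final java.util.Map<String, String> customers = new java.util.HashMap<>();
    InMemoryCustomerDataSource add(String id, String payload) { customers.put(id, payload); return this; }
    public String fetchCustomer(String id) { return customers.get(id); }
}

// "HTTP Server Data Source" for production: delegates to the real backend.
class HttpCustomerDataSource implements CustomerDataSource {
    private final java.net.http.HttpClient client = java.net.http.HttpClient.newHttpClient();
    private final String baseUrl;
    HttpCustomerDataSource(String baseUrl) { this.baseUrl = baseUrl; }
    public String fetchCustomer(String id) {
        try {
            var request = java.net.http.HttpRequest
                    .newBuilder(java.net.URI.create(baseUrl + "/customers/" + id))
                    .build();
            return client.send(request, java.net.http.HttpResponse.BodyHandlers.ofString()).body();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```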

How Contract Tests Improve the Quality of Your Distributed Systems

SOLID Architecture in Slices not Layers

For too long we've lived under the tyranny of n-tier architectures. Building systems with complicated abstractions, needless indirection and more mocks in our tests than a comedy special. But there is a better way - thinking in terms of architectures of vertical slices instead of horizontal layers. Once we embrace slices over layers, we open ourselves to a new, simpler architecture, changing how we build, organize and deploy systems.

Scaling without cross-functional teams

Growing Object-Oriented Software, Guided by Tests Without Mocks Unit testing anti-patterns: Structural Inspection

Test automation without a headache: Five key patterns

What’s your release process like?

ndepend and code analysis

Lessons from Building Static Analysis Tools at Google

Writing Documentation When You Aren't a Technical Writer hn. Semantic linefeeds. guidelines.

automated the checks as much as possible with linters [2].

Age of Invisible Disasters

Conforming container antipattern

microfrontends

Break Up With Your Frontend Monolith - Elisabeth Engel

Compositional UIs - the Microservices Last Mile - Jimmy Bogard

Explicitly Yours

jdepend

Before using JDepend, it is important to understand that "good" design quality metrics are not necessarily indicative of good designs. Likewise, "bad" design quality metrics are not necessarily indicative of bad designs. The design quality metrics produced by JDepend should not be used as yard sticks by which all designs are measured.

Reconstructing thalia.de with self-contained systems

Optimizing for iteration speed

one of the scariest things in software engineering is “inventory” of code that builds up without going into production. It represents deployment risk, but also the risk of building something users don’t want. Not to mention lost user value from not shipping parts of the feature earlier (user value should be thought of as feature value integrated over time, not as the feature value at the end state).

Thinking Architecturally

Unit test your Java architecture tweet

If your primary motivation for building microservices is to enforce modular architectures, think twice. Modularity is solved within the JVM (JPMS, OSGi, JBoss Modules; even multimodule builds get you far), don't pay the price of distributed computing + remote calls just for this.

Majestic Modular Monolith!

SonarJS

Apache Kafka als Backend für Webanwendungen?

How Events Are Reshaping Modern Systems by Jonas Bonér

Serverless

Complex Event Flows in Distributed Systems

Designing Events-first Microservices. Journey to Event Driven – Part 1

Microservices in a Post-Kubernetes Era

In the post-Kubernetes era, using libraries to implement operational networking concerns (such as Hystrix circuit breaking) has been completely overtaken by service mesh technology.

the testing renaissance. tweets about testing.

HANDS-ON INTRO TO KUBERNETES & OPENSHIFT.

Tell Don't Ask. more. How Interfaces Are Refactoring Our Code. The art of embugging. GetterEradicator.

AWS Solution Architect Associate exam.

Hybrid Networking Reference Architectures.

Docs as Code – Architekturdokumentation leicht gemacht.

No More Silos: How to Integrate Your Databases with Apache Kafka and CDC.

Streaming Data Clears the Path for Legacy Systems Modernization.

Integrating legacy and CQRS.

Streaming MySQL tables in real-time to Kafka.

Streaming databases in realtime with MySQL, Debezium, and Kafka. Mentioned in: ¡Larga vida al legacy!.

not being the owner of the data model should be something temporary

it seems that the "secondary system" is read-only at first

in the next pass, somehow, you have to be able to modify the old system... more complex architectures with bidirectionality. Don't use an event bus. Expose services in the new system, and have the legacy system call them. Don't use an event bus (?)

without events, I force the legacy software to know where I've moved the little piece I took away from it.

[to synchronize] we can use events, triggers...

GETTING STARTED WITH DDD WHEN SURROUNDED BY LEGACY SYSTEMS - Eric Evans. bubble context. strategy 1 - bubble context. strategic design. bounded context.

Listen to Yourself: A Design Pattern for Event-Driven Microservices.

For example, you cannot guarantee that a commit to Cassandra and a message delivery to Kafka would be done atomically or not done at all.

Let’s take a common use case: Updating a local NoSQL database and also notifying a legacy system of record about the activity.

However, there is still a concrete problem: How do you guarantee atomic execution of both the NoSQL writes and the publishing of the event to the message broker?

Note: Potential duplicate messages are always a possibility with a message broker so you should design your message handling to be idempotent regardless of the solution you choose.

All your events and database writes must be idempotent to avoid duplicate records.
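
A minimal sketch of an idempotent consumer, assuming each message carries a unique id; the names are illustrative, and in practice the processed-id check would live in a table updated in the same transaction as the business write:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class IdempotentHandler {
    // Record of message ids we have already applied.
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();

    void handle(String messageId, Runnable businessLogic) {
        // add() returns false if the id was already seen, so redeliveries become no-ops.
        if (!processedIds.add(messageId)) {
            return; // duplicate delivery, already applied
        }
        businessLogic.run();
    }
}
```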

The client isn’t guaranteed to read their own writes immediately.

The transaction log tailing pattern can achieve similar results to those described here. Your transactions will be atomic without resorting to two phase commit. The transaction log tailing pattern has the added benefit of guaranteeing your database is committed before returning a response to the client.

Pattern: Transaction log tailing.

each step of a saga must atomically update the database and publish messages/events. It is not viable to use a distributed transaction that spans the database and the message broker.

How to solve two generals issue between event store and persistence layer?.

Event-Driven Data Management for Microservices.

One way to achieve atomicity is for the application to publish events using a multi-step process involving only local transactions. The trick is to have an EVENT table, which functions as a message queue, in the database that stores the state of the business entities.
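
A rough JDBC sketch of that EVENT-table trick, assuming hypothetical ORDERS and EVENT tables; a separate relay process would poll EVENT and publish its rows to the message broker:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import javax.sql.DataSource;

class CheckoutService {
    private final DataSource dataSource;
    CheckoutService(DataSource dataSource) { this.dataSource = dataSource; }

    void placeOrder(String orderId, String payloadJson) throws Exception {
        try (Connection conn = dataSource.getConnection()) {
            conn.setAutoCommit(false);
            try (PreparedStatement insertOrder =
                     conn.prepareStatement("INSERT INTO ORDERS (ID, PAYLOAD) VALUES (?, ?)");
                 PreparedStatement insertEvent =
                     conn.prepareStatement("INSERT INTO EVENT (AGGREGATE_ID, TYPE, PAYLOAD) VALUES (?, ?, ?)")) {
                insertOrder.setString(1, orderId);
                insertOrder.setString(2, payloadJson);
                insertOrder.executeUpdate();
                // The event row commits atomically with the business data in one local
                // transaction; no two-phase commit with the broker is needed.
                insertEvent.setString(1, orderId);
                insertEvent.setString(2, "OrderPlaced");
                insertEvent.setString(3, payloadJson);
                insertEvent.executeUpdate();
                conn.commit();
            } catch (Exception e) {
                conn.rollback();
                throw e;
            }
        }
    }
}
```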

Domain events: simple and reliable solution.

In an event-driven architecture there is also the problem of atomically updating the database and publishing an event.

[my thoughts] are the options: 1 - the "listen to yourself" pattern and 2 - "keeping an internal events table"? Also 3 - "log tailing"?

Paypal talk. slides. Streaming Data Microservices. Oracle Golden Gate.

from the slides: slide 66: XA transactions ensure consistency, give up availability; slide 67: Event Sourcing gives up read-your-writes consistency (is this the "listen to yourself" pattern?); slide 68: Change Data Capture gives read-your-writes + eventual consistency across systems

OLAP engines like Apache Druid, LinkedIn's Pinot

"Use Change Data Capture, rather than XA Transactions or Event Sourcing, for replicating data between data systems where consistency is required, etc., such as financial services... Also use schemas"

logs, not queues!

Unlike queues, consumers don't delete entries; Kafka manages their lifecycles

N Consumers, who start reading where they want

Akka Streams, Kafka Streams - libraries for “data-centric microservices”. Smaller scale, but great flexibility
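
A small sketch of the "consumers start reading where they want" point above, using the plain Kafka consumer API; the topic, partition, and group names are made up:

```java
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class ReplayingConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "reporting-rebuild");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition partition = new TopicPartition("orders", 0);
            consumer.assign(List.of(partition));
            // Nothing is deleted on read: this consumer rewinds to the start of the log and
            // recomputes its results, independently of any other consumer group's position.
            consumer.seekToBeginning(List.of(partition));
            consumer.poll(Duration.ofSeconds(1))
                    .forEach(record -> System.out.println(record.offset() + ": " + record.value()));
        }
    }
}
```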

Kubernetes: Your Next Application Server. video.

How to extract change data events from MySQL to Kafka using Debezium.

about architectures.

Openshift secrets management

one step forward, two steps back.

Introduction to the Kubernetes Operator Framework

Keynote: Maturing Kubernetes Operators - Rob Szumski.

DEATH OF LOGGING, HEXAGONAL ARCHITECTURES, TECHNOLOGY AND ARCHITECTURES.

Kubernetes The Database.

Introduction to Cloud Storage for Developers.

building a CI / CD bot with Kubernetes

docker secrets

managing env vars in production comes down to doing one of two things: using an environment file that is securely stored and securely retrieved, or retrieving each key from a secure secrets management service like Vault, Keywhiz or CyberArk. The former is easier, as it requires less infrastructure, but requires greater care. The latter requires more infrastructure but handles things like role-based access for each key more easily

Istio Multicluster on OpenShift

Kafka for long-term storage. so. hn. How Pinterest runs Kafka at scale. Streaming Hundreds of Terabytes of Pins from MySQL to S3/Hadoop Continuously. Is Kafka a database?. experimentation with event-based systems. The Magical Rebalance Protocol of Apache Kafka. The Death and Rebirth of the Event-Driven Architecture. ETL Is Dead, Long Live Streams. event-based architectures with Kafka and Atom. Complex Event Flows in Distributed Systems. Restoring Confidence in Microservices: Tracing That's More Than Traces.

This is an important requirement for processes that calculate real-time results but need to periodically recalculate results (say when their processing logic changes).

Something to keep in mind as well is that cluster restarts (especially after unclean shutdowns) might take a very long time, as all logs would need to be checked at broker startup. Apart from that I can't think of large reasons not to do this, though I agree that dumping data to S3/HDFS/similar should be the preferred solution

we use Kafka to transport data to our data warehouse, including critical events like impressions, clicks, close-ups, and repins. We also use Kafka to transport visibility metrics for our internal services

Language-oriented software engineering. tweet.

Using ETL Staging Tables

The future of Kubernetes is Virtual Machines hn

Developing applications on OpenShift in an easier way

Mastering Spring framework 5, Part 2: Spring WebFlux

Day Two Kubernetes: Tools for Operability.

kubernetes guideposts 2019. Simple Multi-tenancy with Django Running on OpenShift

KUBERNETES FAILURE STORIES. hn.

Cloud native Java EE on OpenShift Adam Bien.

making the most of Kubernetes clusters

Scaling a Distributed Stream Processor in a Containerized Environment

Kubinception.

kubernetes vs. docker

rethinking legacy and monolithic systems

"how do I propagate state across asynchronous, reactive execution pipelines?". video. Spring Tips: Testing Reactive Code. RxJava vs Reactor. Reactive Spring: Eine Einführung in die reaktive Programmierung. Point-to-Point Messaging Architecture - The Reactive Endgame. building reactive pipelines tweet. reactive DDD. How (not) to use Reactive Streams in Java 9+. Assembly time Subscription time Execution time) 404. construyendo pipelines reactivos slides. Spring Tips: Reactive MySQL with Jasync SQL and R2DBC. reactive streams operators. RxJava by example. reactive jdbc tweet. 5 reasons to use RxJava in your projects. reactive programming - lessons learned. reactive transactions. reactive-revolution course materials. marble diagrams. reactive streams and Kotlin flows. Event Driven with Spring. How to build Reactive Server in 50 minutes. moving from imperative to reactive. reactive programming - lessons learned. more. slides. Five Things About RxJS and Reactive Programming. Going full reactive with Spring Webflux and the new CosmosDB API v3. reactive streams basic concepts. Streaming data as one additional use case for #reactive programming. building reactive pipelines. the value of reactive systems. reactor. Do's and Don'ts: Avoiding First-Time Reactive Programmer Mines.

Learn Openshift operator framework

certified

In big companies, 95% of apps are still old school: firewall -- load balancer -- 5 front ends -- 3 back ends -- two database servers.

"everybody wants to get rid of ELK for logging quite soon".

airhacks tv 59 docker vs. openshift effective web standards

metrics for the masses

Service Catalog and Kubernetes

Kubernetes declarative object configuration model is one of the most interesting features of the orchestrator

12 ways to get smarter about Kubernetes.

Microservices in a Post-Kubernetes Era.

How Kubernetes can break: networking

Automating stateful applications with Kubernetes operators Reaching for the Stars with Ansible Operator

Why are we templating YAML?.

An Incremental Architecture Approach to Building Systems.

Various links about persistence and DDD:

https://tech.transferwise.com/hibernate-and-domain-model-design/ https://stackoverflow.com/questions/10099636/are-persistence-annotations-in-domain-objects-a-bad-practice https://stackoverflow.com/questions/14737652/entity-objects-vs-value-objects-hibernate-and-spring https://stackoverflow.com/questions/31400432/ddd-domain-entities-vo-and-jpa https://stackoverflow.com/questions/2597219/is-it-a-good-idea-to-migrate-business-logic-code-into-our-domain-model https://stackoverflow.com/questions/821276/why-should-i-isolate-my-domain-entities-from-my-presentation-layer https://softwareengineering.stackexchange.com/questions/350067/is-it-good-practice-to-use-entity-objects-as-data-transfer-objects https://softwareengineering.stackexchange.com/questions/378866/understanding-ddd-when-using-an-orm-such-as-hibernate https://softwareengineering.stackexchange.com/questions/171457/what-is-the-point-of-using-dto-data-transfer-objects https://softwareengineering.stackexchange.com/questions/140826/do-orms-enable-the-creation-of-rich-domain-models https://blog.pragmatists.com/refactoring-from-anemic-model-to-ddd-880d3dd3d45f https://enterprisecraftsmanship.com/2016/04/05/having-the-domain-model-separate-from-the-persistence-model/

Custom Implementations for Spring Data Repositories.

Three-Part Architecture of the Next Generation Data Center Inside NetApp.

Conquering the Challenges of Data Preparation for Predictive Maintenance

Java 9: Bessere Domänenmodelle mit Java-9-Modulen.

Links about the strangler pattern https://news.ycombinator.com/item?id=19122973 strangler pattern https://news.ycombinator.com/item?id=19125333 https://paulhammant.com/2013/07/14/legacy-application-strangulation-case-studies/ https://www.michielrook.nl/2016/11/strangler-pattern-practice/ https://trunkbaseddevelopment.com/strangulation/ https://www.leadingagile.com/2018/10/the-urge-to-stranglethe-strangler-pattern/ https://www.martinfowler.com/bliki/StranglerApplication.html https://twitter.com/martinfowler/status/357142664665251841 https://blog.overops.com/strangler-pattern-how-to-keep-sane-with-legacy-monolith-applications/ https://blogs.sap.com/2017/09/25/strangler-applications-monolith-to-microservices/

Links about DTO mappers https://auth0.com/blog/automatically-mapping-dto-to-entity-on-spring-boot-apis/ https://www.baeldung.com/entity-to-and-from-dto-for-a-java-spring-application http://modelmapper.org/ https://medium.com/@hackmajoris/a-generic-dtos-mapping-in-java-11d649b8a486 https://stackoverflow.com/questions/2828403/dto-and-mapper-generation-from-domain-objects https://stackoverflow.com/questions/14523601/bo-dto-mapper-in-java https://stackoverflow.com/questions/15117403/dto-pattern-best-way-to-copy-properties-between-two-objects https://stackoverflow.com/questions/1432764/any-tool-for-java-object-to-object-mapping https://stackoverflow.com/questions/678217/best-practices-for-mapping-dto-to-domain-object https://codereview.stackexchange.com/questions/64731/mapping-interface-between-pojos-and-dtos https://softwareengineering.stackexchange.com/questions/171457/what-is-the-point-of-using-dto-data-transfer-objects https://www.jhipster.tech/using-dtos/ http://appsdeveloperblog.com/java-objects-mapping-with-modelmapper/ http://www.adam-bien.com/roller/abien/entry/creating_dtos_without_mapping_with The Ping class is a JPA entity and JSON-B DTO at the same time: http://www.adam-bien.com/roller/abien/entry/creating_dtos_without_mapping_with DTOs are also motivated by their typesafe nature. Lacking typesafety, JSON-P JsonObjects are not used as DTOs. https://www.credera.com/blog/technology-solutions/mapping-domain-data-transfer-objects-in-spring-boot-with-mapstruct/ https://rmannibucau.wordpress.com/2014/04/07/dto-to-domain-converter-with-java-8-and-cdi/ https://vladmihalcea.com/the-best-way-to-map-a-projection-query-to-a-dto-with-jpa-and-hibernate/ https://github.com/porscheinformatik/anti-mapper jpa hashCode euquals dilemma

What's new in Spring Data

Running your own DBaaS based on your preferred DBs, Kubernetes operators and containerized storage.

Microservices in a Post-Kubernetes Era

In the post-#Kubernetes era, using libraries to implement operational networking concerns (such as Hystrix circuit breaking) has been completely overtaken by service mesh technology.

Netflix Titus, Its Feisty Team, and Daemons.

Scaling a Distributed Stream Processor in a Containerized Environment

Odo.

Is Shared Database in Microservices actually anti-pattern?. hn

The Whys and Hows of Database Streaming

The Changing Face of ETL: Event-Driven Architectures for Data Engineers

Your migrations are bad, and you should feel bad. hn.

An introduction to distributed systems

Paying Technical Debt at Scale - Migrations @Stripe.

Transaction scripts https://dzone.com/articles/transaction-script-pattern https://stackoverflow.com/questions/16139941/transaction-script-is-antipattern https://gunnarpeipman.com/architecture-design-patterns/transaction-script-pattern/ https://learnbycode.wordpress.com/2015/04/12/the-business-logic-layer-transaction-script-pattern/ http://www.servicedesignpatterns.com/webserviceimplementationstyles/transactionscript http://lorenzo-dee.blogspot.com/2014/06/quantifying-domain-model-vs-transaction-script.html http://grahamberrisford.com/AM%202%20Methods%20support/06DesignPatternPairs/Domain%20Driven%20Design%20v.%20Transaction%20script.htm

Automating applications with @kubernetesio operators

Code in the database vs. code in the application. 2. 3. 4. 5. 6. 7 by Lukas Eder. tweet. 8. reddit. 9. 10. 11. mf.

The big myth perpetrated by architects who don’t really understand relational database architecture (me included early in my career) is that the more tables there are, the more complex the design will be.

feature flags at Twitter

Kubernetes commandments

Kafka running on OpenShift4 using Ceph Block Storage

What's next for Kubernetes.

12 Factors for Cloud Native and Openshift.

Openshift on Azure

domain probes hn

data preparation for predictive machine learning

spring high performance batch processing

Installing Openshift 4 from start to finish. Multiple stages within a Kubernetes cluster.

Migrating a Retail Monolith to Microservices: Sebastian Gauder at MicroXchg Berlin. slides.

microservices gone wrong

Idempotency - challenges and solutions over HTTP

reflections on moving to Kubernetes. advanced kubernetes. when to use kubernetes.

Bringing up an OpenShift playground in AWS

Should that be a microservice? hn.

deploy != release. Testing in Production, the safe way. Deploy != Release (Part 1). Deploy != Release (Part 2). Istio Observability with Go, gRPC, and Protocol Buffers-based Microservices. works in staging. Using Blue-Green Deployment to Reduce Downtime and Risk . NoStaging. How to Deploy Software with Envoy. Reactive REST API Using Spring Boot and RxJava.

Mature Microservices and How to Operate Them.

Reconciling Kubernetes and PCI DSS for a Modern and Compliant Payment System.

become a better software architect

Eoin Woods on Democratising Software Architecture at ICSA 2019

Software architecture is still needed because stakeholders are still around, we need to decide on design tradeoffs and we have several cross-cutting concerns in software. In practice, what happens nowadays is having more empowered cross-functional teams and using more lightweight descriptions for architecture than in the past. Architecture diagrams that are difficult to understand and evolve are now replaced by lightweight C4 diagrams and Architecture Decision Records. Static and runtime code analyses combined with informal documentation in the form of a wiki or PowerPoint documents can substitute for complex static documents. Tools like SonarQube for static code analysis, or Jaeger, Zipkin, ELK, Prometheus/Grafana and New Relic for distributed monitoring and tracing of services in production, can give an accurate and real-time view of code and its architecture.

Architecture decision record

Drinking from the stream. slides.

Streaming IoT Data and MQTT Messages to Apache Kafka.

distributed tracing

DDD Ports and Adapters with Onion architecture, what goes where?

What is left inside the Hexagon is the logic to gather external data, call a decision maker and process the result.

Layers, Onions, Ports, Adapters: it's all the same

I've put the UI components (the orange boxes) and the Data Access components (the blue boxes) in the same layer

DDD, Hexagonal, Onion, Clean, CQRS, … How I put it all together

the typical application flow goes from the code in the user interface, through the application core to the infrastructure code, back to the application core and finally deliver a response to the user interface.

while the CLI console and the web server are used to tell our application to do something, the database engine is told by our application to do something

The adapters that tell our application to do something are called Primary or Driving Adapters while the ones that are told by our application to do something are called Secondary or Driven Adapters.

Cockburn on Hexagonal Architecture

The ports and adapters pattern is deliberately written pretending that all ports are fundamentally similar. That pretense is useful at the architectural level. In implementation, ports and adapters show up in two flavors, which I’ll call ‘’primary’’ and ‘’secondary’’, for soon-to-be-obvious reasons. They could be also called ‘’driving’’ adapters and ‘’driven’’ adapters.

good description of ports and adapters

Asymmetry: the Configurable Dependency implementation is different for each side. On the driver side, the application doesn’t know which adapter is driving it. But on the driven side, the application must know which driven adapter it must talk to.
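
An illustrative Java sketch of the primary/secondary split described above, with invented names; the core implements the driving port and only knows the driven port, never the adapter behind it:

```java
// Primary (driving) port: the use case the outside world calls into.
interface PlaceOrderUseCase {
    void placeOrder(String productId, int quantity);
}

// Secondary (driven) port: something the application core needs from the outside.
interface OrderRepository {
    void save(String productId, int quantity);
}

// The application core implements the driving port and is configured with the driven port.
class OrderService implements PlaceOrderUseCase {
    private final OrderRepository repository; // the core knows the port, not the adapter
    OrderService(OrderRepository repository) { this.repository = repository; }
    public void placeOrder(String productId, int quantity) {
        repository.save(productId, quantity);
    }
}

// A primary adapter (e.g. a web controller or a CLI command) drives the core through PlaceOrderUseCase.
// A secondary adapter (e.g. a JDBC or JPA implementation of OrderRepository) is driven by the core.
```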

Isn't this just a layered architecture with a different name and drawn differently?. Onion vs. N-Layered Architecture. hexagonal architecture with spring data. video.

The Evolution of Comcast’s Architecture Guild

Application Integration for Microservices Architectures: A Service Mesh Is Not an ESB

Craftconf architecture talk

Real-time Data Processing using Redis Streams and Apache Spark Structured Streaming

majestic modular monoliths

How Netflix Thinks of DevOps

How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh. tweet

cloud native apps

clean code, direct style

code with few possible control flow combinations, direct style (can always trace what connects to what), doesn’t violate grep test, comments explain why.

FRP can help increase code cohesion

Lessons Learned Replacing a DI Framework in a Legacy Codebase

DI composition root. What is a composition root in the context of Dependency Injection. more. more. more. more. more. Clean Composition Roots with Pure Dependency Injection (DI). Are We Just Moving the Coupling?

Microservices, Apache Kafka, and domain-driven design

91 global variables in Excel that were protected by one spin lock. No one could unravel the hairball

A Service Mesh Is Not an ESB

A service mesh is only meant to be used as infrastructure for communicating between services, and developers should not be building any business logic inside the service mesh.

Restoring Confidence in Microservices: Tracing That's More Than Traces

The Potential for Using a Service Mesh for Event-Driven Messaging

Cloud Functions

Aggregating REST and Real-Time Data Sources

Maintainable ETLs. hn

USE THE MOST PRODUCTIVE STACK YOU CAN GET. JAVA'S JOB LISTINGS, JWT, KAFKA, SERVERLESS, STREAMING, JARS IN WARS, THREADS, CODE COVERAGE--63RD AIRHACKS.TV

microfrontends

cloud transactions

the state of Java relational persistence. slides. Spring Data JPA from 0-100 in 60 Minutes

TRANSACTIONS, J2EE, JAVA EE, JAKARTA EE, MICROPROFILE AND QUARKUS

temporal modelling

regarding bad internal technology. HN.

Fast key-value stores: An idea whose time has come and gone. hn

Getting value out of your monad

some best (?) practices

the Challenges of Operationalizing Microservices

Mistakes we made adopting event sourcing

And if you store events with both an event_timestamp and effective_timestamp, you get bi-temporal state for free too. Invaluable when handling a time series of financial events subject to adjustments and corrections. For instance, backdate interest adjustments due to misbooked payments, recalculate a derivatives trade if reported market data was initially incorrect, calculate adjustments to your business end of month P&L after correcting errors from two months ago.
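
A minimal sketch of that bi-temporal shape, with illustrative names; the point is that a correction is just a new event whose effective timestamp lies in the past:

```java
import java.math.BigDecimal;
import java.time.Instant;

record LedgerEvent(
        String accountId,
        String type,                 // e.g. "InterestAdjustment"
        BigDecimal amount,
        Instant eventTimestamp,      // when we learned about it / wrote it down
        Instant effectiveTimestamp   // when it actually applies in the business timeline
) {}

// A backdated correction is a new event: eventTimestamp = now, effectiveTimestamp in the past.
// Replaying by effectiveTimestamp gives the corrected history; replaying by eventTimestamp
// gives "what we knew at the time".
```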

as time goes by - technical challenges of bi-temporal Event Sourcing. same talk. event sourcing with bi-temporal data

The evolution of the Shopify codebase

forging a functional enterprise

Feature Flags and Test-Driven Design: Some Practical Tips

Design Techniques for Building #Streaming Data, Cloud-Native Applications

lots of lambdas. tweet.

How to get along with HATEOAS

Von Service-orientierten Architekturen (SOA) zu DDD und Microservices

When we try to force a service decomposition that isn't really there, we freeze today's technical design into an organizational design

Updating Materialized Views and Caches Using Kafka

the good parts of aws

Event-sourcing at Nordstrom: Part 2

apache kafka tutorials

TEMPORAL MODELLING

envers vs. debezium

Software Architecture Guide

How To Keep the Layers of your Spring App Separate using Integration Tests

MySQL CDC with Apache Kafka and Debezium

Message transformations for change data capture

Modern applications at AWS

Dependency Management and Versioning With a Maven Multi-Module Project

From PHP to transactions - airhacks

perhaps not a good idea

what microservices are

performance matters

Package by feature or by layer

hexagonal architecture in practice more more

intimidated by the sheer breadth of #DDD

DDDTrouble

"Building audit logs with change data capture and stream processing"

you can't have a rollback button. Rolling Forward and other Deployment Myths

Debezium resources

software archeology

CDC, @debezium Streaming and @apachekafka an http://airhacks.fm episode

How many storage devices does a workload require?

Battle of the circuit breakers

Streaming Database Changes with Debezium by Gunnar Morling slides

CQRS

Kafka Streams: Topology and Optimizations

Build your own X

How to sleep at night having a cloud service: common Architecture Do's

"Stop Mapping Stuff in Your Middleware" Logic in the database vs. logic in the application

The dark side of events. Finding your service boundaries. Monolith Decomposition Patterns. the usefulness of pre-allocating ids at the beginning. Event-Driven Microservices, the Sense, the Non-sense and a Way Forward

Kafka stream workshop and slides.

The Configuration Complexity Curse – Don’t Be a YAML Engineer

Step Away From The Database - A step-by-step example of how to introduce Hazelcast into an existing database backed application.

auto-formatting @java source code as part of the build process is a blessing

Have you built applications following #DDDesign principles, using #JPA for persistence?

To Domain Driven Design

Vertical Slices. Out with the Onion, in with Vertical Slices. The Importance of Vertically Slicing Architecture. Vertical Slice Architecture. APLICA VERTICAL SLICE. Why vertical slice architecture is better. Our architecture is a mess! Are you sure?

Ensuring rollback safety during deployments. Dealing with safely rolling forward and rolling back stateful services isn't something people talk about much, if at all. It's the sort of thing that gets hand-waved away.

Java Cloud Native Starter or Kubernetes, OpenShift, istio, Postgres, Clouds, Backend for Frontend, vue.js and MicroProfile

To DTO or not to DTO

Azure for AWS specialists

Our setup of Prometheus and Grafana (as of the end of 2019)

A Thought Experiment: Using the ECS Pattern Outside of Game Engines. cache-friendliness

CSRF, XSS, JWT, REACTIVE DATABASES, TX AND WEBSOCKETS, JSON-B, OPENSHIFT

Practical Change Data Streaming Use Cases with Apache Kafka & Debezium

Qualities of a Highly Effective Architect

Plumbing At Scale

END-TO-END ARGUMENTS IN SYSTEM DESIGN

How I write backends. hn

The Many Faces of Modularity

3 database architectures

Modular Monolithic Architecture

Monoliths are the future

2020 Predictions

The Let It Crash Philosophy Outside Erlang

Scaling to 100k Users. complexity

Data Modernization for Spring-based Microservices

Why did disabling hyperthreading make my server slower?

Modularity does not have to be fancy. It could be as simple as using DDD and intelligent package naming

Testing Microservices: an Overview of 12 Useful Techniques - Part 1

Data-oriented architecture

git-flow vs GitHub flow

This is not the class of software that I had in mind when I wrote the blog post 10 years ago. If your team is doing continuous delivery of software, I would suggest to adopt a much simpler workflow (like GitHub flow) instead of trying to shoehorn git-flow into your team.

If, however, you are building software that is explicitly versioned, or if you need to support multiple versions of your software in the wild, then git-flow may still be as good of a fit to your team as it has been to people in the last 10 years. In that case, please read on.

Branching is a core concept in Git, and the entire GitHub flow is based upon it. There's only one rule: anything in the master branch is always deployable.

Ready for changes with Hexagonal Architecture | Netflix Tech Blog

Re-architecting 2-tier to 3-tier

Builders, fluent builders

Your database as an API

becoming an architect

GOTO 19 "Good Enough" Architecture. Monolith Decomposition Patterns • Sam Newman. Building Resilient Frontend Architecture • Monica Lent

Systems design for Advanced Beginners

humble guide to database schema design

"API-first"

Beyond the Distributed Monolith

Should services always return DTOs, or can they also return domain models?. Service layer returns DTO to controller but need it to return model for other services. Pass DTO to service layer. Map DTOs and Models within the Service Layer due to a missing Business Layer. Entity To DTO Conversion for a Spring REST API. LocalDTO (2004)

Not only can you pass DTO objects to the Service Layer, you should pass DTO objects instead of Business Entities to the Service Layer.

Your service should receive DTOs, map them to business entities and send them to the repository. It should also retrieve business entities from the repository, map them to DTOs and return the DTOs as responses. So your business entities never get out of the business layer, only the DTOs do.
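
A small sketch of that flow (DTO in, DTO out, entities stay inside), with placeholder classes rather than any particular framework:

```java
class CustomerDto { String name; String email; }
class Customer { String name; String email; }   // business entity, never leaves the business layer

interface CustomerRepository {
    Customer save(Customer customer);
}

class CustomerService {
    private final CustomerRepository repository;
    CustomerService(CustomerRepository repository) { this.repository = repository; }

    CustomerDto register(CustomerDto dto) {
        Customer entity = new Customer();          // map DTO -> entity
        entity.name = dto.name;
        entity.email = dto.email;
        Customer saved = repository.save(entity);
        CustomerDto response = new CustomerDto();  // map entity -> DTO for the response
        response.name = saved.name;
        response.email = saved.email;
        return response;
    }
}
```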

Some people argue for them as part of a Service Layer API because they ensure that service layer clients aren't dependent upon an underlying Domain Model. While that may be handy, I don't think it's worth the cost of all of that data mapping. As my contributor Randy Stafford says in P of EAA "Don't underestimate the cost of [using DTOs].... It's significant, and it's painful - perhaps second only to the cost and pain of object-relational mapping".

Let's now look at a service level operation – which will obviously work with the Entity (not the DTO)

```java
@GetMapping
@ResponseBody
public List<PostDto> getPosts(...) {
    //...
    List<Post> posts = postService.getPostsList(page, size, sortDir, sort);
    return posts.stream()
            .map(this::convertToDto)
            .collect(Collectors.toList());
}
```

service layer (old)

Enterprise applications typically require different kinds of interfaces to the data they store and the logic they implement: data loaders, user interfaces, integration gateways, and others. Despite their different purposes, these interfaces often need common interactions with the application to access and manipulate its data and invoke its business logic. The interactions may be complex, involving transactions across multiple resources and the coordination of several responses to an action. Encoding the logic of the interactions separately in each interface causes a lot of duplication.

Presentation Model (old)

[ON DTOS (old)] much conflicting information!

DTOs are only created when their structure significantly differs from that of the entity. In all other cases the entity itself is used. The cases when you don’t want to show some fields (especially when exposing via web services to 3rd parties) exist, but are not that common. This can sometimes be handled via the serialization mechanism – mark them as @JsonIgnore or @XmlTransient for example
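
A tiny sketch of that serialization-level alternative, assuming Jackson and a hypothetical Account entity:

```java
import com.fasterxml.jackson.annotation.JsonIgnore;

public class Account {
    public Long id;
    public String email;

    @JsonIgnore            // never serialized when the entity is returned from a web endpoint
    public String passwordHash;
}
```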

Don’t use the mappers/entity-to-dto constructors in controllers, use them in the service layer. The reason DTOs are used in the first place is that entities may be ORM-bound, and they may not valid outside a session (i.e. outside the service layer).

performance of Java mapping frameworks

Creating large Java applications composed of multiple layers requires using multiple models such as persistence model, domain model or so-called DTOs. Using multiple models for different application layers will require us to provide a way of mapping between beans.

Dozer is a mapping framework that uses recursion to copy data from one object to another. The framework is able not only to copy properties between the beans, but it can also automatically convert between different types.

sooo subsetting and type conversions are an important part of mapping frameworks?
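
Apparently so. A quick sketch with ModelMapper (one of the mappers linked above), using invented Post/PostDto classes, where the mapping copies the matching subset of properties:

```java
import org.modelmapper.ModelMapper;
import org.modelmapper.config.Configuration.AccessLevel;

class Post { public Long id; public String title; public String internalNotes; }
class PostDto { public Long id; public String title; }   // a subset of the entity

class MappingExample {
    public static void main(String[] args) {
        ModelMapper modelMapper = new ModelMapper();
        // match public fields by name (the default configuration looks at getters/setters)
        modelMapper.getConfiguration()
                   .setFieldMatchingEnabled(true)
                   .setFieldAccessLevel(AccessLevel.PUBLIC);

        Post post = new Post();
        post.id = 1L;
        post.title = "Hello";
        post.internalNotes = "not exposed";

        PostDto dto = modelMapper.map(post, PostDto.class); // copies only the matching subset
        System.out.println(dto.id + " " + dto.title);
    }
}
```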

to DTO or not to DTO

DTO : Hipster Ou Dépassé ?

The magic behind the Dependency Injection of Quarkus

Java & SQL - stronger together. overview of the lasagna

Domain Events Versus Change Data Capture

From Batch to Streaming to Both. Internet of Tomatoes: Building a Scalable Cloud Architecture.

newbie architect

Software Architecture for Agile Enterprises

One thing I never liked about ORMs and OGMs: letting the application define the database schema, indexes or constraints.

Haskell arch

Domain event

Event-Driven Architectures for Spring Developers

Adoption of Cloud Native Architecture, Part 2: Stabilization Gaps and Anti-Patterns

Things I Wish I’d Known About CSS

Dividing front end from back end is an antipattern (?)

Domain-Oriented Microservice Architecture

For any event-based system, the message structures it exposes to external consumers are its public interface. Evolve their schemas with the same care and attention to backwards compatibility as your synchronous APIs. CDC breaks encapsulation

How to design a REST API that can “prompt” the client about long-running operations?

Building dashboards for operational visibility

Inside the Hidden World of Legacy IT Systems.

I've tackled legacy systems my entire career, and there is a certain art to untying the knot of dependencies, procedures, and expectations.

It’s painful to see the people who know all the hotkeys and key sequences on an old green terminal suddenly thrust into a world of mouse hunting and clicking. It makes you wonder if new things are really better.

HLint evolution

Under Deconstruction: The State of Shopify’s Monolith

To Microservices and Back Again very good.

Design Microservice Architectures the Right Way (2018)

Event Sourcing You are doing it wrong (2018) See the papers mentioned at the end: the dark side of event sourcing. versioning in an event sourced system.

Moving BBC Online to the cloud

Go in Production – Lessons Learned

Asynchronous Task Scheduling at Dropbox

If you have the opportunity, please do not build it like this. Referring to the architectural diagram, it is going to be much more efficient for the "Frontend" to persist the task data into a durable data store, like they show, but then the Frontend should simply directly call the "Store Consumer" with the task data in an RPC payload. There is no reason in the main execution path why the store consumers should ever need to read from the database, because almost all tasks can cut-through immediately and be retired. Reading from the database should only need to happen due to restarts and retries of tasks that fail to cut through.

Terrible Source code

One of the best, and first, things we did when starting our machine learning platform was to design it using a plugin architecture. There's a lot of scar tissue and horrible experience through our previous ML products we built for enterprise. Namely, it was extremely hard to onboard new developers to work on the product. They had to understand the whole thing in order to contribute.

Not Just Events: Developing Asynchronous Microservices. Creating event-driven microservices: the why, how and what.

Haskell app architecture.

Stored Procedures as a Back End

sagas - Azure reference architectures. sagas for consistency. 2. Not Just Events: Developing Asynchronous Microservices. Battle-tested event-driven patterns for your Microservices archit. Opportunities and Pitfalls of Event-driven Utopia

Clean Architecture Boundaries with Spring Boot and ArchUnit

Clean Architecture with Spring by Tom Hombergs

If All You Have Is a Database, Everything Looks Like a Nail. HN.

Soon, there was an established trend that increased the entropy and intertwining of applications and tables. It became common to have transactional updates across tables for different apps.

Sometimes people stage read-only copies of tables. These are asynchronously updated from the authoritative owning application. Other applications then “own” the read-only copy in their application set of tables.

is only good advice if the tables are application-specific data and you don't do microservices in that stupid braindead way that makes it so that everything from the admin panel to data visualization are their own "applications" with their own databases and doing things that would be even the simplest of queries becomes a project in writing what are effectively bad-performance joins via random http APIs. I.e., have a data model and understand where the most painless boundaries are, don't throw up dozens of DBs for the hell of it.

Look into the patterns of CQRS, event sourcing, flow based programming and materialized views. GraphQL is an interface layer, but you still have to solve for the layer below. API composition only works when the network boundary and services are performance compatible to federate queries. The patterns above can be used to work around the performance concern at a cost of system complexity.

Don't forget the part where the queries are impossible to test, because you can't spin up real instances of all 15 APIs in a test environment, so all the HTTP calls are mocked and the responses are meaningless!

A lot of posters here seem to have been deeply burned from microservices designed along the wrong lines. I mean, sure, it happens. You're going to make mistakes just like you can misjudge how to separate concerns in a set of classes. It shouldn't be an issue to fix it. Maybe some teams focus on pure separation before they have a solid design? Maybe its just a culture of feeling like service boundaries can never change?

There’s a model of software based around shipping events around, and subscriptions between systems. The purposes of separation are at least a couple important, perhaps you know. Each has a DB, often embedded, that is suitable and materialized from the subscriptions and its own; mutated predictably.

Software Design for Flexibility book.

Engineers who participated in originally building a system are often orders of magnitude faster at fixing bugs and building features than engineers who joined later.

logs

But in practice, the accumulation of cold data on a local disk is where this starts to hurt, particularly if that has to serve read traffic which starts from the beginning of time (i.e your queries don't start with a timestamp range).

KSQL transforms do help reduce the depth of the traversal, by building flatter versions of the data set, but you need to repartition the same data on every lookup key you want - so if you had a video game log trace, you'd need multiple materializations for (user), (user,game), (game) etc.

1) Write an event recording a desire to check out. 2) Build a view of checkout decisions, which compares requests against inventory levels and produces checkout results. This is a stateful stream/stream join. 3) Read out the checkout decision to respond to the user, or send them an email, or whatever.

CDC is great and all, too, but there are architectures where ^ makes more sense than sticking a database in front.

Admittedly working up highly available, stateful stream-stream joins which aren't challenging to operate in production is... hard, but getting better.
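
A very rough Kafka Streams sketch of steps 1-3 above (not the commenter's actual design); the topic names and the join logic are invented, and serdes are assumed to be configured in the Streams properties:

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.JoinWindows;
import org.apache.kafka.streams.kstream.KStream;
import java.time.Duration;

public class CheckoutTopology {
    public static StreamsBuilder build() {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> requests = builder.stream("checkout-requests");   // step 1: desires to check out
        KStream<String, String> inventory = builder.stream("inventory-levels");

        // step 2: a stateful, windowed stream/stream join producing checkout decisions
        KStream<String, String> decisions = requests.join(
                inventory,
                (request, level) -> Integer.parseInt(level) > 0 ? "APPROVED" : "REJECTED",
                JoinWindows.of(Duration.ofMinutes(5)));

        decisions.to("checkout-decisions"); // step 3: read out downstream to respond, email, etc.
        return builder;
    }
}
```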

Unpopular opinion: SQL is better than GraphQL. some good aspects. better for trees and DAGs. level limitation. what about using views?. another (older) comparison

GraphQL is better when what you are requesting is best expressed as a tree (or a "graph", though only the DAG variety). This is not always the case, but it very often is when building API:s for production use.

Of course, you can express tree structures in table form, but it is not very convenient for clients to consume. In particular if your client is rendering nested component views, what you want is very often something hierarchical.

performance is more predictable, exactly because the language is more restricted. You can't just join in all the things, or select billions of rows by accident. The schema dictates what is allowed.

you can't request a tree structure (eg: a menu with submenus) with an unknown number of levels.

You don't have to expose your entire schema, instead expose carefully designed SQL views (so you can refactor your tables without breaking your API)

Lessons Learned from Reviewing 150 Infrastructures

sharing transactions and persistence contexts across module boundaries -- yea or nay?

Data architecture vs backend architecture

are queues overkill?

Big little guide to message queues.

How we rebuilt the Walmart Autocomplete Backend

good checklist

React created roadblocks in our enterprise app. original link. Using react in enterprise contexts. The "seams" link in that last one is interesting as well.

Software engineering topics I changed my mind on

Architecture.md. ADR.

reworking of GHC's errors: a nice architectural choice to avoid cyclic dependencies

The complexity that lives in the GUI

Why microservices: part 5

The Database Inside Your Codebase

You probably don’t need a micro-frontend

Developing microservices with aggregates

Modules, monoliths, and microservices. hn.

Why isn't Godot an ECS-based game engine? . lobsters.

testing quarkus

bbc and serverless

Capturing Every Change From Shopify’s Sharded Monolith

Backpressure in Reactive Systems

not necessarily microservices but something akin to serverless functions running on a managed platform

In praise of --dry-run

Software Architecture Design for Busy Developers

kafka

The pedantic checklist for changing your data model in a web application

database migrations and continuous delivery

Zero-downtime schema migrations in Postgres using views

Notes on streaming large API responses

don't forget structure and then try to remember it

Microservices and Cross-Cutting Concerns

Qualities of a Highly Effective Architect

events, not webhooks

The Database Ruins All Good Ideas

Thinking in Events: From Databases to Distributed Collaboration Software

On the Evilness of Feature Branching

Changes tend to be made higher up in the stack, ultimately the UI, because that has a lower risk of breaking something else. This gets very messy very fast.

How much business logic should be allowed to exist in the controller layer?. How accurate is “Business logic should be in a service, not in a model”?. Why put the business logic in the model?.

requirements

Soliciting requirements is an iterative process, starting at an abstract level and diving down as you iterate. It is a data pull from the stakeholders; so it is about asking a ton of questions, several different ways, and becoming more tactical as you go along.

Solving the double (quintuple) declaration Problem in GraphQL Applications

Domain services (2012). Services in Domain-Driven Design (DDD).

application services which act as a facade. Application services are simple classes which have methods corresponding to use cases in your domain

When a significant process or transformation in the domain is not a natural responsibility of an ENTITY or VALUE OBJECT, add an operation to the model as standalone interface declared as a SERVICE. Define the interface in terms of the language of the model and make sure the operation name is part of the UBIQUITOUS LANGUAGE. Make the SERVICE stateless.
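
A tiny Java sketch of that guidance, with illustrative types: a significant operation that belongs to no single entity becomes a stateless, standalone SERVICE interface named in the ubiquitous language:

```java
import java.math.BigDecimal;

record Account(String id, BigDecimal balance) {}

// Standalone interface declared as a SERVICE; stateless, named after the domain operation.
interface FundsTransferService {
    void transfer(Account from, Account to, BigDecimal amount);
}
```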

Retry long-running message processing in case of processing node failure.

Are Repositories implementations part of my domain? Should repositories have SQL queries?

DDD repositories in application or domain service

What does Unsplash cost in 2019?

Detect potential AWS costs savings

AWS Cost Management

Best Practices Design Patterns: Optimizing Amazon S3 Performance

Automatic Feedback-Directed Optimization for Warehouse-Scale Applications . tweet

Announcing the new pricing plan for AWS Config rules

AWS costs every programmer should know. hn.

reducing computing costs

Hakuna Cloud – Stop cloud servers when they are not in use

BigQuery best practices

understanding data transfer in AWS

AWS facts

multi-cloud

Bank of America's CEO says it's saved $2B per year by building its own cloud

Prevent Unnecessary Expense from Amazon Web Service (AWS) by Demystifying Its Cost Structure

unbundling AWS

Would add that edge compute, running cloud paradigms (code instead of config; automation; management abstractions), partially addresses these limitations for many use cases.

Costly to move the data off, but longer-term ROI for those orgs that are willing to make long-term decisions.

Meanwhile, as edge matures, greenfield apps should be edge-centric, rather than cloud-centric (doesn't mean they won't have cloud components...they will do the processing and storage where it best makes sense).

Once you have significant data on AWS it costs you so much to transfer it you are stuck with them. Their data fees are insane, and so are their storage fees.

The ominous opacity of the AWS bill – a cautionary tale

Cloud bandwidth costs are a rip off

the Amazon premium

IT operation costs traps

How to compete with AWS

comparison of Cloud provider costs

We use Kubernetes and spot instances to reduce EC2 billing

the only type of API services I will ever use

It’s 3 times more expensive to send 1TB of data out of Amazon EC2 than it is to buy a 1TB drive from Amazon.
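A rough back-of-the-envelope check, assuming the long-standing ~$0.09/GB rate for the first egress tier: 1 TB ≈ 1,000 GB × $0.09/GB ≈ $90 to transfer out of EC2, versus roughly $30 to $50 for a 1 TB external drive, which is where a figure of about 3x comes from.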

AWS cost explorer

Hotel California for your data

Mastering AWS Cost Optimization: Real-world technical and operational cost-saving best practices

How to burn the most money with a single click in Azure

spot instances

How we reduced our Google Maps API cost by 94%

pricing calculator for AWS

why Zoom chose Oracle

validate your pricing name

oops

free to send, costly to retrieve

costs saver

A Developer’s Guide to Cloud Costs

AWS budget actions

Is a billion dollars worth of server lying on the ground?. reddit.

If you need a predefined small number of VMs and no other functionality, it would be silly to go with AWS. But on the other hand, if you want a set of servers of a given class spawning on demand, with traffic coming in via load balancers, with integrated certificate and DNS management, with programmable lifecycle hooks, with integrated authn/authz, with full audit logs and account management, with configurable private networking to other services, etc. etc. ... You'll pay more than the price difference for someone to implement all of that from scratch.

5 IT Operations Cost Traps and How to Avoid Them

Taking Control of Confusing Cloud Costs

The Various Billing Philosophies of AWS

create an estimation

GCP Billing Budgets that send Pub/Sub notifications to Functions

Please fix the AWS free tier before somebody gets hurt

huge bills while learning

cloud cost podcast

a black hole of unpredictable spend, according to new report

Is the unit of compute a "machine" or is it a millisecond of CPU and a GB of memory?

CSS-tricks

Three-sided border

CSS z-index Property

Note: z-index only works on positioned elements (position:absolute, position:relative, or position:fixed).

CSS Grid in IE: Debunking Common IE Grid Misconceptions

#5: Columns of Equal Height: Super Simple Two Column Layout

https://stackoverflow.com/questions/3298746/apply-different-css-stylesheet-for-different-parts-of-the-same-web-page

Combining multiple CSS files without conflicting code?

Guidelines for better and faster CSS

Guidelines for Brutalist Web Design HN

css-nesting request to pick up the css-nesting proposal

The problem with CSS pre-processors. What Will Save Us from the Dark Side of CSS Pre-Processors? WHY I'M (STILL) AGAINST SASS & LESS

Modern CSS Explained For Dinosaurs

CSS Utility Classes and "Separation of Concerns"

You might not need a CSS framework

Bootstrap

Transclusion in self-contained systems

At first, that sounds obvious, especially when you pretend that styles and scripts are isolated. Unfortunately they aren’t. However, if you manually provide for the highest possible isolation, for example by preventing collisions of CSS selectors (e.g. using system-specific HTML class prefixes), you can come to grips with this problem.

css isolation - How To Isolate a div from public CSS styles? - CSS isolation: there has got to be a better way - Sandbox local HTML/CSS code snippets inside an iframe (for style guides/pattern libraries) - How to isolate CSS styles to one area?

Some JavaScript libraries provide a noConflict mode. The technique is very simple: when the library initially loads, it keeps a copy of the global variable that sat where it wants to live. When noConflict is called, the library puts the old global variable back where it was.

Restrict CSS applying on a particular div. Reset/remove CSS styles for element only

A Vision for Our Sass

Basic concepts of flexbox

When we describe flexbox as being one dimensional we are describing the fact that flexbox deals with layout in one dimension at a time — either as a row or as a column. This can be contrasted with the two-dimensional model of CSS Grid Layout, which controls columns and rows together.

writing-mode

The writing-mode CSS property defines whether lines of text are laid out horizontally or vertically, as well as the direction in which blocks progress.

Introduction to the CSS basic box model

Layout and the containing block

CSS Border-Image

My Favorite Ways of Centering With CSS

Bootstrap 4.1.2 released hn

9 CSS in JS Libraries You Should Know in 2018

Layoutit – An interactive CSS Grid generator HN

Automatically remove unused css from Bootstrap or other frameworks

Constructing Modern UIs with SVG - Tim G. Thomas

USING CSS GRID WHERE APPROPRIATE

https://www.mikecr.it/ramblings/functional-css/

https://news.ycombinator.com/item?id=18083508

https://lobste.rs/s/fqkodg/defense_functional_css

https://tailwindcss.com/docs/what-is-tailwind/

https://adamwathan.me/css-utility-classes-and-separation-of-concerns/

https://www.reddit.com/r/programming/comments/9j4cab/in_defense_of_functional_css_mike/

https://www.reddit.com/r/css/comments/9imhlv/its_2018_you_shouldnt_be_writing_vanilla_css/

https://itnext.io/what-is-modular-css-659949e23534

http://getbem.com/introduction/

Tailwind: A Utility-First CSS Framework.

CSS Layout cookbook. HN.

Incomplete List of Mistakes in the Design of CSS. hn.

Difference between justify-content vs align-items?. What the flex is the difference between justify-content, align-items, and align-content?!. A Quick Way to Remember the Difference Between justify-content and align-items. Demystifying CSS alignment. justify-items. aligning items in a flex container. reddit. CSS justify-content Property.

Centering in CSS: A Complete Guide.

Keeping CSS short with currentColor.

Difference between justify-content vs align-items?.

The Lowdown on :before and :after in CSS.

absolute vs. relative positioning. more.

You're using wrong. lobsters.

Electron and the Decline of Native Apps.

When To Use The Button Element

animated grid layout

Why is CSS so damn hard?

Houdini and the Paint API

the state of CSS

things about css

Digging Into The Display Property: The Two Values Of Display

css grid & tables

SVG Properties and CSS

the state of css 2019

every layout

CSS Houdini & The Future of Styling by Una Kravets

How to Make Your Website Not Ugly - basic UX for programmers

https://medium.com/@wendersyang/what-the-flex-is-the-difference-between-justify-content-align-items-and-align-content-5fd3694f5259

https://stackoverflow.com/questions/35049262/difference-between-justify-content-vs-align-items

https://css-tricks.com/almanac/properties/a/align-items/ https://developer.mozilla.org/es/docs/Web/CSS/align-items https://www.youtube.com/watch?v=GsSk9zv19AE How To Overlay One Div Over Another Div Using CSS

The z-index will only change stacking order, not the x & y positioning. Relative positioning can be used to offset elements in x y space and create an overlap. But you may have to think carefully how you apply the offsets to keep responsive.

https://css-tricks.com/absolute-relative-fixed-positioining-how-do-they-differ/ https://dzone.com/articles/css-position-relative-vs-position-absolute https://developer.mozilla.org/en-US/docs/Web/CSS/position https://stackoverflow.com/questions/2027657/overlapping-elements-in-css

frontend design, react, and a bridge over the great divide

The CSS background-image property as an anti-pattern. HN

top 10 css mistakes

This Ain’t Disney: A practical guide to CSS transitions and animations. HN

Resilient CSS: 7-part Series. tweet

The Differing Perspectives on CSS-in-JS

Layout Land. resilient CSS

Bert Bos & Håkon Wium Lie | CSS Reset | CSS Day 2017

Pseudo-classes

The progression of CSS layouts

In Search of the Holy Grail (2006) old, obsolete.

5 ways to vertically center with CSS

Using whitespace to make our designs look better

https://uxmovement.com/buttons/the-myths-of-color-contrast-accessibility/

https://twitter.com/tailwindcss

In Defense of Utility-First CSS hn CSS Utility Classes and "Separation of Concerns" hn

Coping with Flexbox

7 Uses for CSS Custom Properties

flexbox woes

Old CSS, New CSS

Using CSS custom properties (variables)

intrinsic sizing in CSS

resilient CSS

Layout-isolated component

specifity in css - a rebuttal

Facebook's CSS-in-JS Approach. CSS Containment Now a Web Standard.

don't design for mobile

render-blocking JavaScript and CSS

display: 'flex', justifyContent: 'center'

Java 9 on Java EE 8 Using Eclipse and Open Liberty

JAVA EE 8 ON JAVA 9 - FROM INSTALL TO DEPLOYMENT WITH OPENLIBERTY SERVER

Build your own open source Eclipse MicroProfile

Graceful Shutdown Spring Boot Applications

How to bootstrap JPA programmatically without the persistence.xml configuration file

Difference Between BeanFactory and ApplicationContext in Spring

Build your own open source Eclipse MicroProfile

The magic of Spring Data

Understanding Jakarta EE: “Modularity is key to faster release cycles”

Spring Boot – Best Practices

10 Spring Boot Security Best Practices

Deep Dive into JUnit 5 Extension Model

Ten Things You Can Do With GraalVM

Java Performance Puzzlers

Exploring Java 9: The Key Parts

Build a MySQL Spring Boot App Running on WildFly on an Azure VM

BORING ENTERPRISE JAVA

JWT and Scalability, JSON-B Configuration, Bulk Data and JAX-RS, EclipseLink, Hibernate and Schema Validation, designing distributed storage applications, Payara Dockerfile explanation twitter

Consumer Driven Contract with Spring Boot

JDBC in Java, Hibernate, and ORMs: The Ultimate Resource

Helidon

Building an offline app from scratch and with web standards only..

sts 4

kubernetes para desarrolladores java

airhacks

A NOTE ON DATA TRANSFER OBJECTS (DTO)S.

Frameworks for Java application development.

Mastering Spring framework 5, Part 2: Spring WebFlux.

Oracle Code One 2018.

Live-Coding Web Apps (PWAs)—Without Frameworks.

Guide to "Reactive" for Spring MVC Developers.

HOW TO STRUCTURE JAKARTA EE APPLICATIONS FOR PRODUCTIVITY WITHOUT BLOAT.

Designing the infrastructure persistence layer.

Can you have multiple transactions within one Hibernate Session?.

A hibernate session is more or less a database connection and a cache for database objects.

A Session is an inexpensive, non-threadsafe object that should be used once and then discarded for: a single request, a conversation or a single unit of work.

How do I get the connection inside of a Spring transaction?.

The transaction manager is completely orthogonal to data sources. Some transaction managers interact directly with data sources, some interact through an intermediate layer (eg, Hibernate), and some interact through services provided by the container (eg, JTA).

Class HibernateTransactionManager.

PlatformTransactionManager implementation for a single Hibernate SessionFactory. Binds a Hibernate Session from the specified factory to the thread, potentially allowing for one thread-bound Session per factory. SessionFactory.getCurrentSession() is required for Hibernate access code that needs to support this transaction handling mechanism, with the SessionFactory being configured with SpringSessionContext.

Note: To be able to register a DataSource's Connection for plain JDBC code, this instance needs to be aware of the DataSource (setDataSource(javax.sql.DataSource)). The given DataSource should obviously match the one used by the given SessionFactory.

This transaction manager is appropriate for applications that use a single Hibernate SessionFactory for transactional data access, but it also supports direct DataSource access within a transaction (i.e. plain JDBC code working with the same DataSource). This allows for mixing services which access Hibernate and services which use plain JDBC (without being aware of Hibernate)! Application code needs to stick to the same simple Connection lookup pattern as with DataSourceTransactionManager (i.e. DataSourceUtils.getConnection(javax.sql.DataSource) or going through a TransactionAwareDataSourceProxy).

Interface Session.

The main runtime interface between a Java application and Hibernate. This is the central API class abstracting the notion of a persistence service.

The lifecycle of a Session is bounded by the beginning and end of a logical transaction. (Long transactions might span several database transactions.)

It is not intended that implementors be threadsafe. Instead each thread/transaction should obtain its own instance from a SessionFactory.

Class DataSourceUtils.

Is aware of a corresponding Connection bound to the current thread, for example when using DataSourceTransactionManager. Will bind a Connection to the thread if transaction synchronization is active (e.g. if in a JTA transaction).

Hibernate commit() and flush().

flush() will synchronize your database with the current state of object/objects held in the memory but it does not commit the transaction. So, if you get any exception after flush() is called, then the transaction will be rolled back. You can synchronize your database with small chunks of data using flush() instead of committing a large data at once using commit() and face the risk of getting an Out Of Memory Exception.

commit() will make data stored in the database permanent. There is no way you can rollback your transaction once the commit() succeeds.

One common case for explicitly flushing is when you create a new persistent entity and you want it to have an artificial primary key generated and assigned to it, so that you can use it later on in the same transaction. In that case calling flush would result in your entity being given an id.

commit() will make the database commit. When you have a persisted object and you change a value on it, it becomes dirty, and Hibernate needs to flush these changes to your persistence layer. So you should commit, but committing also ends the unit of work (transaction.commit()).

It is usually not recommended to call flush explicitly unless it is necessary. Hibernate usually calls flush automatically at the end of the transaction and we should let it do its work. Now, there are some cases where you might need to explicitly call flush, where a second task depends upon the result of the first persistence task, both being inside the same transaction.

Hibernate sessions and transaction management guidelines.

Hibernate sessions are not thread-safe. Not only does this mean you shouldn’t pass a Hibernate session into a new thread, it also means that because objects you load from a session can be called from (and call back to) their owning session, you must not share Hibernate-managed objects between threads. Once again, try to only pass object IDs, and load the object freshly from the new thread’s own session.

Spring’s transaction management places the Hibernate session in a ThreadLocal variable, accessed via the sessionFactory. All Confluence DAOs use that ThreadLocal. This means that when you create a new thread you no longer have access to the Hibernate session for that thread (a good thing, as above), and you are no longer part of your current transaction.

MicroProfile, the microservice programming model made for Istio.

Spring Boot in a Container.

From Jakarta EE over MicroProfile to Serverless: Interactive Onstage Hacking.

Full Stack Reactive Java con @ProjectReactor!. tweet.

2019 predictions.

Jakarta EE MicroProfile WebStandards, On Stage Hacking (no slides) by Adam Bien

JSON-P: REMOVING A SLOT FROM A JSONOBJECT WITH JSONPATCH.

Pagination and Sorting With Spring Data JPA

tweet

So. The correct way to fix someone else's Spring Boot setup is to just try random annotations until it works, right?

EXCEPTION HTTP STATUS MAPPING WITHOUT MAPPERS

i18n in Java 11, Spring Boot, and JavaScript

searching in a distributed world

OPTIMIZING FOR HUMANS, NOT MACHINES

jee

the Spring Framework Early Days, Languages Post-Java, & Rethinking CI/CD

how fast is spring?

java in the 21 century

webmvc.fn

Spring tips - dinamic views

MULTIPLE CACHE CONFIGURATIONS WITH CAFFEINE AND SPRING BOOT

Caching is key for performance of nearly every application. Distributed caching is sometimes needed, but not always. In many cases a local cache would work just fine and there’s no need for the overhead and complexity of the distributed cache.

Using ConfigMaps to configure MicroProfile / Java EE 8 applications

A Quick Guide to Spring Boot Login Options

Reactive Transactions with Spring

@ComponentScan on a @Service class in #Spring actually works and contributes to the application context

Basic Concepts: @Bean and @Configuration

Build a Spring Boot App With Flyway and Postgres

Spring boot tips. video.

Event Driven Microservices with Axon and Spring Boot

the proxy fairy and the magic of Spring

spring boot internals

I DON'T YOUR DEPENDENCY INJECTION

correspondences

bootiful podcast hateoas

Installing Jenkins, creating S2I build, setting up a CD pipeline, building, deploying and testing a Java EE / Jakarta EE / MicroProfile service (twice) and configuring the readiness probe ...in 7 minutes

Spring Cloud Data Flow

Things that should not appear in Java code in 2019

(about no getters) Most libraries (Jackson, Spring, etc) have supported direct field access for a while.

What's new in Spring Framework 5.x

CODE SHRINKING TECHNIQUES WITH JAKARTA EE AND MICROPROFILE--DEVOXX

live refactoring session

polyglot microservice example using #helidon and #graalVM

web frameworks

From Spring Boot apps to functional Kotlin

Modernize and optimize Spring Boot applications

dynamic CDS archives

Understanding Low Latency JVM GCs - Jean-Philippe BEMPEL

The Lean, Mean... OpenJDK?

JAKARTA EE 8: LINKS AND RESOURCES

JAVA EE IS DEAD

The definite guide to Java agents

Writing controllers

CODE SHRINKING WITH QUARKUS AND PANACHE ORM

Live-Coding Web Apps (PWAs)—Without Frameworks

Shenandoah – ultra-low Pause Time Garbage Collector

How Quarkus brings imperative and reactive programming together

How to Get Productive with Spring Boot

Quarkus and CORS

My view is that a vast majority of applications are fine as monoliths

Well secured and documented REST API with Eclipse Microprofile and Quarkus

Configuring a Main Class in Spring Boot

Back to Shared Deployments

Autoconfigurations In-Depth

Part III: Read Entities - Jakarta EE CRUD API Tutorial

Full-Duplex Scalable Client-Server Communication with WebSockets and Spring Boot (Part I)

Best Performance Practices for Hibernate 5 and Spring Boot 2

create a simple @SpringData #JPA application using @intellijidea

Kubernetes Identity Management: Authentication. Key Features to Consider When Evaluating an Enterprise Kubernetes Solution.

Rancher vs. OKD

Project Calico and the Challenge of Cloud Native Networking

Develop Hundreds of Kubernetes Services at Scale with Airbnb

Re-Imagining Virtualization with Kubernetes and KubeVirt – Part II

Create a nested virtual machine in a Microsoft Azure Linux VM

Using Kubernetes ConfigMap Resources for Dynamic Apps

kubernetes trends

Tutorial: Explore Istio’s Traffic Rules and Telemetry Capabilities.

Upgrading your Cluster with Zero Downtime

masters are updated first, nodes follow

API Gateways and Service Meshes: Opening the Door to Application Modernisation

A lot of the multi-platform/hybrid cloud questions really revolve around what/where your control plane is

Enable Dynatrace OneAgent in Istio service mesh

Istio is a service mesh that supports running distributed microservice architectures. It’s a prominent vehicle that typically runs in Kubernetes to control inter-pod and inter-service traffic from Kubernetes workloads. For this, Istio uses Kubernetes Mutating Admission Webhooks for automatically injecting a sidecar proxy into pods.

Kubernetes and OpenShift Networking Primer

Goodbye AWS: Rolling your own servers with Kubernetes, Part 1. hn

OpenShift 4

Running Kubernetes in Production: A Million Ways to Crash Your Cluster | DevOpsCon 2018

Container Design Patterns for Kubernetes, Part 1

Creating an Effective Developer Experience for @kubernetesio and Cloud-native Apps

Deploying HA PostgreSQL on OpenShift using Portworx

Deploying Docker Containers using an AWS CodePipeline for DevOps

Deploying a Haskell application to AWS Elastic Beanstalk

Docker on AWS - what is a difference between Elastic Beanstalk and ECS?. EKS vs. ECS: orchestrating containers on AWS. ECS Vs. EKS Vs. Fargate.

This shouldn’t have to be said, but do not put your Kubernetes API Server on the public internet.

Docker data science pipeline

A Practical kubernetes Operator using Ansible — an example

The 10 Kubernetes commandments

KubeCon EU 2019 "Securing Cloud Native Communication: From End User to Service"

Expanding the Kubernetes Operator Community

Kubernetes storage on Digital Ocean

AWS woes

the basics of stateful applications in kubernetes

openshift course

The Gorilla Guide to Kubernetes in the Enterprise — Chapter 2:

Powering Flexible Payments in the Cloud with Kubernetes. Reconciling Kubernetes and PCI DSS for a Modern and Compliant Payment System.

How Kubernetes and Configuration Management works

Configuration Best Practices. Labels and Selectors. Using labels effectively.

Openshift 4.1

How Canary Deployments Work, Part 1: Kubernetes, Istio and Linkerd

Pod Evictions based on Taints/Tolerations

cool projects right now

Isolating Linux containers with SDN

Kubernetes basics: Learn how to drive first

How to navigate the Kubernetes learning curve

Migrating From Self-Managed Kubernetes to AWS EKS Using Terraform at Blue Matador

Kubernetes: Long Label Names and UX

Creating a Killer Database Architecture with Kubernetes + MariaDB

Kubernetes Design Principles: Understand the Why

Helm Chart Patterns [I]

Persistent Storage with Kubernetes in Production - Which Solution and Why?

introduction to Kubernetes secrets and configmaps

the ability to leverage a canary rollout of the various control planes. more. testing upgrades.

Kubernetes failure stories

Machine API in Openshift 4

testing and local dev can be tricky at times

rethinking best practices. more recent tweet.

Reddit thread about OKD 4.1

Learn Openshift with Minishift

Kubernetes on CentOS 7 with Firewalld

Name Resolution Issue Due To Cache Inconsistencies In CoreDNS.

AWS's Outposts

version 1.15

future of CRD - schemas

I question, though, whether circuit breaking/timeouts/retries should be externalized (deferred) to the network.

life of a packet through istio

reliable AWS services

Regions provide physical mapping to the real world that allow you to deal with latency, compliance, failure domains, and data locality.

uses of daemonsets

Maintaining big Kubernetes environments with factories

k8s versus openshift thorough comparison

Argo

https://argoproj.github.io/

https://itnext.io/argo-workflow-engine-for-kubernetes-7ae81eda1cc5 https://jaxenter.com/argo-workflow-engine-kubernetes-151694.html https://blog.argoproj.io/introducing-argo-a-container-native-workflow-engine-for-kubernetes-55c0b4b76fac https://fission.io/workflows/ https://kubernetes.io/blog/2018/06/28/airflow-on-kubernetes-part-1-a-different-kind-of-operator/ https://medium.com/@doronsegal/workflow-using-argo-kubernetes-6b45ef3f1614 https://containerjournal.com/topics/container-ecosystems/camunda-brings-workflow-engine-to-kubernetes/ https://www.youtube.com/watch?v=oXPgX7G_eow https://eksworkshop.com/batch/ https://blog.kintohub.com/how-do-we-ditch-jenkins-for-argo-1c0b4df5dab0 Why did we ditch Jenkins for Argo? https://applatix.com/introducing-argo-container-native-workflow-engine-kubernetes/ https://dzone.com/articles/parallel-workflows-on-kubernetes https://workflowengine.io/blog/does-it-make-sense-to-build-your-own-workflow-engine/ https://medium.com/@arik.cohen/lets-make-workflow-engines-fun-again-73c4ad5eb428 list of engines, good resource https://github.com/meirwah/awesome-workflow-engines https://zenaton.com/features/workflow-engine/ https://dzone.com/articles/workflow-management-how-build https://kissflow.com/workflow/workflow-engine-business-rule-engine-difference/ https://blog.bernd-ruecker.com/architecture-options-to-run-a-workflow-engine-6c2419902d91 https://camunda.com/solutions/add-workflow-software/ https://www.infoq.com/news/2009/07/WFEngine/ (2009) https://www.quora.com/What-should-I-use-to-weigh-the-decision-to-use-a-workflow-engine-and-build-workflow-into-our-in-house-application-vs-using-a-third-party-workflow-tool-such-as-Pipefy pachyderm/pachyderm#3345 kubeflow/kubeflow#376 https://siliconangle.com/2018/11/07/kubeflow-shows-promise-standardizing-ai-devops-pipeline/ https://www.kubeflow.org/docs/use-cases/gitops-for-kubeflow/ https://github.com/kubeflow/pipelines https://medium.com/kubeflow/kubeflow-in-2018-a-year-in-perspective-49c273b490f4 https://www.youtube.com/watch?v=zVTNobgvR9M Kuberflow + Argo https://blog.argoproj.io/using-gitops-to-deploy-kubeflow-with-argo-cd-76f6b27807c https://news.ycombinator.com/item?id=18425084 Google's new Kubeflow Pipelines service uses Argo. 
https://www.speechmatics.com/2019/01/argo-learn-all-about-the-kubernetes-workflow-engine/ http://dev.matt.hillsdon.net/2018/03/24/argo-integration-review.html Airflow: the future of data engineering https://news.ycombinator.com/item?id=13761071 https://www.youtube.com/watch?v=oXPgX7G_eow https://www.youtube.com/watch?v=VrsVbuo4ENE Compare to Apache Airflow argoproj/argo-workflows#849 https://github.com/argoproj/data-pipeline https://medium.com/@doronsegal/workflow-using-argo-kubernetes-6b45ef3f1614 https://www.astronomer.io/blog/using-apache-airflow-to-create-data-infrastructure/ https://towardsdatascience.com/data-pipelines-luigi-airflow-everything-you-need-to-know-18dc741449b7 https://stackoverflow.com/questions/57037302/apache-airflow-or-argoproj-for-long-running-and-dags-tasks-on-kubernetes https://medium.com/@dieswaytoofast/kubernetes-workflow-with-argo-74b776b252c1 https://github.com/brigadecore/brigade https://admiralty.io/blog/running-argo-workflows-across-multiple-kubernetes-clusters/ https://azure.microsoft.com/es-es/services/kubernetes-service/ https://www.youtube.com/watch?v=M_rxPPLG8pU https://towardsdatascience.com/data-engineering-basics-of-apache-airflow-build-your-first-pipeline-eefecb7f1bb9 https://www.youtube.com/watch?v=pKLPXA-gnvw https://www.youtube.com/watch?v=6eNiCLanXJY https://www.youtube.com/watch?v=43wHwwZhJMo what is a pipeline and what is a workflow? https://bioinformatics.stackexchange.com/questions/7347/what-is-the-difference-between-a-bioinformatics-pipeline-and-workflow https://www.lightbend.com/blog/how-to-deploy-kubeflow-on-lightbend-platform-openshift-support-components-kubeflow argo in OpenShift https://www.lightbend.com/blog/how-to-deploy-kubeflow-on-lightbend-platform-openshift-support-components-kubeflow https://github.com/argoproj/argo-ui https://fission.io/workflows/ https://kubernetes.io/blog/2018/06/28/airflow-on-kubernetes-part-1-a-different-kind-of-operator/ a comparison https://xunnanxu.github.io/2018/04/13/Workflow-Processing-Engine-Overview-2018-Airflow-vs-Azkaban-vs-Conductor-vs-Oozie-vs-Amazon-Step-Functions/ https://www.youtube.com/watch?v=pKLPXA-gnvw https://www.youtube.com/watch?v=qUwz20v7lcc https://www.youtube.com/watch?v=yXkLuPaLPoE architecture decisions https://blog.bernd-ruecker.com/architecture-options-to-run-a-workflow-engine-6c2419902d91 another comparison https://xunnanxu.github.io/2018/04/13/Workflow-Processing-Engine-Overview-2018-Airflow-vs-Azkaban-vs-Conductor-vs-Oozie-vs-Amazon-Step-Functions/ Kickoff Argo workflows via REST call https://stackoverflow.com/questions/54912490/kickoff-argo-workflows-via-rest-call https://github.com/argoproj/argo/blob/master/docs/rest-api.md#examples https://www.paradigmadigital.com/dev/apache-airflow/ Azure solution? 
https://azure.microsoft.com/en-us/services/logic-apps/ https://notetoself.tech/2018/04/08/logic-apps-x-microsoft-flow-which-one-should-i-choose/ more logic apps https://thenewstack.io/serverless-and-workflows-the-present-and-the-future/ even more logic apps https://xo.xello.com.au/blog/why-use-logic-apps-integration-platform https://kubernetes.io/docs/reference/using-api/api-concepts/ https://stackoverflow.com/questions/tagged/argoproj?tab=Votes https://opendatahub.io/news/2019-04-29/project-road-map-for-2019.html the Ceph data lake https://opendatahub.io/ https://www.youtube.com/watch?v=STh3F2g2gsM https://www.youtube.com/watch?v=eCGx8Y1qcmU https://www.openstack.org/assets/presentation-media/QCT-Lightning-Talk-Building-Big-Data-Analytics-Data-Lake-with-All-flash-Ceph.pdf https://devconfus2019.sched.com/event/RFDN/ml-pipelines-with-kubeflow-argo-and-open-data-hub The Open Data Hub (ODH) is a scalable data lake platform that provides tools such as distributed Spark and Ceph data store. https://bigdata.cioreview.com/cxoinsight/data-lake-building-a-bridge-between-technology-and-business-nid-24733-cid-15.html https://www.alibabacloud.com/help/doc-detail/119725.htm https://es.slideshare.net/inovex/data-science-und-machine-learning-im-kuberneteskosystem https://www.redhat.com/en/blog/why-spark-ceph-part-1-3 https://kubernetes.io/blog/2018/06/28/airflow-on-kubernetes-part-1-a-different-kind-of-operator/ https://go.qct.io/wp-content/uploads/2019/07/Real-time-Analytics-with-All-Flash-Ceph-Data-Lake-Architecture-EN_20180611.pdf Brigade https://brigade.sh/ https://cloudblogs.microsoft.com/opensource/2019/04/01/brigade-kubernetes-serverless-tutorial/ https://www.interline.io/blog/scaling-openstreetmap-data-workflows/ Azure blob storage artifact support

opinion on Argo. opinion on Airflow.

CI / CD pipelines now on digital ocean

Mistake that cost thousands (Kubernetes, GKE)

The fact that configmaps are not bound to a specific replicaset is one of Kubernetes' worst design decisions.

The shipwreck of GKE Cluster Upgrade

A Kubernetes/GKE mistake that cost me thousands of dollars

Kubernetes patterns - declarative deployments

Kubernetes from scratch to AWS with Terraform and Ansible (part 1)

uses of Ceph at CERN

k8s 1.16. hn.

"Managed Kubernetes" really runs the spectrum between "one step above just installing it yourself on a bunch of VMs" and "I spend 1% of my time managing anything below the product." Each cloud provider exists somewhere different on this spectrum, with none of them being in quite the same location, and some of them have multiple different products which exist at different points.

For example: AWS is among the most bare-bones. EKS is just a managed control plane; coming from GKE, you might click "create an cluster" then be very confused how there are no options for, say, instance size, or how many... because you have to do that all yourself. There are tools like eksctl or Rancher which can help with this, but ultimately, you're managing those instances. You're doing capacity planning (you think kube would be a great pick to integrate with spot fleets because of its ability to schedule and move workloads to a new instance when one goes down? have fun setting it up, hope you like ops work.). You're doing auto-scaling (and that ASG? its not going to know about your pod resource requests, so you either need some very smart manual coordination between the two, or you need to set up cluster-autoscaler). You're setting up cluster metrics (definitely need metrics-server. not heapster, that was last year, metrics-server is this year. but how to visualize? do i host grafana in the cluster? then i need to worry about authn. cloudwatch really isn't made for these kinds of things... maybe I'll just give datadog a few thousand bucks.) Crap, 1.16 is out already? They only support 9 months of releases with security updates?! I feel like I just upgraded my nodes! Oh well, time to lose a day replicating this update across all of my environments.

DigitalOcean is pretty similar to this (it does provision instances, but the tooling beyond that is barebones). Google Cloud/GKE is "more managed" in a few sense; the cloud dashboard provides some great management capabilities out-of-the-box, such that you may not need to reach for something like Datadog, and the autoscaler works really well without a lot of tinkering. There are still underlying instances, so you're worrying about ingress protection, OS hardening, OS upgrades, etc... but its not as bad as AWS. Not by a long shot.

Infrastructure Wars with Sheng Liang

Using a Kubernetes based Cluster for Various Services with auto HTTPS

Configuration management with Kubernetes

Write your own Kubernetes

Everything I know about Kubernetes I learned from a cluster of Raspberry Pis

The Focusing Illusion of Developer Productivity

Azure Functions Private Site Access

Java Application Optimization on Kubernetes on the Example of a Spring Boot Microservice

The Open Application Model from Alibaba's Perspective

Dhall for Kubernetes

Effective Management of APIs

Comparing Kubernetes CNI Providers: Flannel, Calico, Canal, and Weave

Understanding Modern Cloud Architecture on AWS

terraform modules

High-density Multi-tenant Bare-metal Cloud

AWS the main services

zero-downtime release

aws tagging best practices

good and bad monitoring

Managing the Risk of Cascading Failure

Using strongly-typed entity IDs to avoid primitive obsession. so answer. But how to implement this in JPA? Perhaps with a fake "composite" id class that only has one component? IdClass. How to create and handle composite primary key in JPA.

Openstreetmap

Why Use OpenStreetMap Instead of Google Maps? hn

https://operations.osmfoundation.org/policies/tiles/

https://wiki.openstreetmap.org/wiki/Slippy_Map

Slippy Map is, in general, a term referring to modern web maps which let you zoom and pan around (the map slips around when you drag the mouse).

https://www.mapbox.com/help/how-web-apps-work/

https://wiki.openstreetmap.org/wiki/Browsing https://wiki.openstreetmap.org/wiki/Main_Page https://wiki.openstreetmap.org/wiki/Tiles

square bitmap graphics displayed in a grid arrangement to show a map

https://switch2osm.org/ https://blog.openstreetmap.org/2018/06/20/switch2osm/

the switch2osm website, with up to date information on running your own OSM based services

Apart from very limited testing purposes, you should not use the tiles supplied by OpenStreetMap.org itself. OpenStreetMap is a volunteer-run non-profit body and cannot supply tiles for large-scale commercial use. Rather, you should use a third party provider that makes tiles from OSM data, or generate your own.

Serving your own maps is a fairly intensive task. Depending on the size of the area you’re interested in serving and the traffic you expect the system requirements will vary. In general, requirements will range from 10-20GB of storage, 4GB of memory, and a modern dual-core processor for a city-sized region to 300GB+ of fast storage, 24GB of memory, and a quad-core processor for the entire planet.

We would recommend that you begin with extracts of OpenStreetMap data – for example, a city, county or small country – rather than spending a week importing the whole world (planet.osm) and then having to restart because of a configuration mistake!

Geofabrik

Mapnik

Meteogalicia

RSS, GeoRSS, Podcast and JSON

cargo.txt

Save RSS and Atom!

Solving JVM Performance Problems with Profilers: Wallclock vs CPU Time Edition

Modern SAT solvers: fast, neat and underused (part 1 of N) HN

The Ultimate JSON Library: JSON.simple vs GSON vs Jackson vs JSONP

The Usefulness of Abstracting Over Time

How to Improve the Performance of a Java Application

Host your blog on DigitalOcean with Docker, Nginx and Let’s Encrypt. hn.

automated formatting

gitops.

Kubernetes: The Surprisingly Affordable Platform for Personal Projects. lobsters.

SonarJS: Detect runtime exceptions in JavaScript

Jackson JSON Views.

Subtyping vs. Parametrization for a Complex Domain.

Introduction to Linux interfaces for virtual networking

Command-line links https://lobste.rs/s/5azafh/how_i_m_still_not_using_guis_2019_guide https://lucasfcosta.com/2019/02/10/terminal-guide-2019.html http://ballingt.com/rich-terminal-applications-2/ Richer command line interfaces https://lobste.rs/s/hqui1o/richer_command_line_interfaces https://lobste.rs/s/ycvcsw/hard_part_becoming_command_line_wizard https://www.johndcook.com/blog/2019/02/18/command-line-wizard/

kubernetes cluster networking

SAT / SMT by example

Kubernetes Borg/Omega history topic

3: Annotations. Borg's Job type had a single notes field. Like the DNS TXT record, that proved insufficient. For example, layers of client libraries and tools wanted to attach additional information.

Don’t read your data from a straw

podman instead of docker

Expressing Business Flows using an F# DSL

AWS costs

Gallery of Processor Cache Effects

F# 4.6

modern SAT solvers are underused

implementing a draft mode for posts

Tracking the weather with Python and Prometheus

s3 & cloudflare

aws costs

kube complexity

Monitoring and Observability with USE and RED.

Introducing Traffic Director: Google's Service Mesh Control Plane.

PCI compliance faq. PCI. A guide to PCI compliance. pci requirements. hn. another story.

How I draw figures for my mathematical lecture notes using Inkscape

robust user interfaces with state machines

Streaming Java CompletableFutures in Completion Order.

On lists, cache, algorithms, and microarchitecture

I/O Is Faster Than CPU – Let’s Partition Resources and Eliminate OS Abstractions hn

Data Structures for Range Minimum Queries in Multidimensional Arrays. Segment Tree | Set 2 (Range Minimum Query). Assignment 1: Range Minimum Queries. Multidimensional segment trees can do range queries and updates in logarithmic time. A Simple Linear-Space Data Structure for Constant-Time Range Minimum Query?. CP-algorithms - sparse table. segment trees.

Automated Refactoring of a U.S. Department of Defense Mainframe to AWS

SRE stuff. more. more.

hoverfly tutorial

Kotlin coroutines android

Self-Hosting Your Own Cloud – Part 2: SMB File Server with Automated Backups using Rsync/Rclone

kythe and semantic.

The configuration complexity clock. dhall. jsonnet. json not a good conf language. just use a programming language. safety guarantees. At what point does a config file become a programming language?

naming convention

Effective problem solving using SAT solvers

Once the problems have been encoded into Boolean logic, solutions can be found (or shown to not exist) automatically, without the need to implement any search algorithm.

The Trouble with Memory

Computer Architecture – ETH Zürich – Fall 2019

debugging stories. awesome LD_PRELOAD.

Hedgehog for state machine testing

The builder pattern https://blog.ploeh.dk/2017/08/21/generalised-test-data-builder/ Test Data Builders in C# https://blog.ploeh.dk/2017/08/15/test-data-builders-in-c/ https://blog.ploeh.dk/2020/02/10/builder-isomorphisms/

Roadmap to becoming a React developer in 2018 hn

Using Redux actions

Make your PWA work offline I

How to visually design state in JavaScript

Redux or ES6?

Offline-capable desktop applications with Angular and Electron (Offlinefähige Desktopanwendungen mit Angular und Electron)

Redux vs. The React Context API

The Terrible Performance Cost of CORS Request on the Single-Page Application. hn.

removing jquery from GitHub. blog.

Creating a Drag-and-Drop File Uploader with React & TypeScript

reactive timed popup

Introducing Hooks. HN. tweet.

React Today and Tomorrow and 90% Cleaner React.

The React hooks proposal shows how to express several existing concepts that are currently bulky (React component declaration, state, context, lifecycle) using only functions
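
A minimal sketch of that idea, assuming the hooks API as eventually released; the `Counter` component and its behaviour are made up for illustration:

```typescript
// State and a lifecycle-style side effect expressed with plain functions only,
// no class declaration. Written with React.createElement so no JSX is needed.
import React, { useEffect, useState } from "react";

export function Counter() {
  const [count, setCount] = useState(0);        // component state via a hook
  useEffect(() => {                             // runs after render, lifecycle-style
    document.title = `Clicked ${count} times`;
  }, [count]);
  return React.createElement(
    "button",
    { onClick: () => setCount((c) => c + 1) },
    `Clicked ${count} times`
  );
}
```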

React Component Patterns by Michael Chan

Vue 3.0. Contains info about using JSX without webpack. gist.

if you are referring to React when mentioning webpack / transpilation, I'd like to mention that you don't need them for React either. It can be reduced to a simple script tag as well.

Could you paste the simple script tag that makes React work without transpilation? I always thought you needed Webpack (or an equivalent) to transpile JSX into a vanilla js function the browser can interpret.

No, I meant simply adding a tag and using react directly, in any context. That's how it was originally designed, btw. It wasn't meant only for SPAs when it was conceived, but as an addon to existing websites. The other comment gives a perfect example and there are a few tutorials (although I do agree not very mainstream) that teach React without JSX/Webpack/Babel/etc.

There's also a babel script that you can drop into a script tag and will transpile stuff in the browser for you if you want JSX. Not recommended for large projects in production (but then I'd be using webpack or similar for large vue projects too), but it works pretty well (and surprisingly fast) for quick experiments.
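
A minimal sketch of what the comments above describe: React used without JSX, so no transpilation of the component code is needed. With the UMD <script> builds on the page the same calls work against the React and ReactDOM globals; module imports are used here only so the snippet stands on its own, and the "root" element id and greeting are made up. This assumes React 18's createRoot API.

```typescript
import React from "react";
import { createRoot } from "react-dom/client";

function Greeting(props: { name: string }) {
  // No JSX: plain React.createElement calls the browser can run as-is.
  return React.createElement("h1", null, `Hello, ${props.name}`);
}

const container = document.getElementById("root");
if (container) {
  createRoot(container).render(React.createElement(Greeting, { name: "world" }));
}
```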

React.Component vs React.createClass.

React.createClass versus extends React.Component.

Two ways to do the same thing. Almost. React traditionally provided the React.createClass method to create component classes, and released a small syntax sugar update to allow for better use with ES6 modules by extends React.Component, which extends the Component class instead of calling createClass.

For the React changes, we now create a class called “Contacts” and extend from React.Component instead of accessing React.createClass directly, which uses less React boilerplate and more JavaScript. This is an important change to note, given the further changes this syntax swap brings.
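
A minimal sketch of the two styles side by side. Assumption: in current React the createClass factory lives in the separate create-react-class package (with its @types package for TypeScript); the "Contacts" name follows the quote above.

```typescript
import React from "react";
import createReactClass from "create-react-class"; // legacy home of createClass

// Old style: a factory call with an object spec.
const ContactsLegacy = createReactClass({
  render() {
    return React.createElement("div", null, "Contacts");
  },
});

// ES6 style: extend React.Component; less React boilerplate, more plain JavaScript.
class Contacts extends React.Component {
  render() {
    return React.createElement("div", null, "Contacts");
  }
}

export { Contacts, ContactsLegacy };
```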

ECMAScript 6 modules: the final syntax.

The default export is actually just a named export with the special name default.

In current JavaScript module systems, you have to execute the code in order to find out what the imports and exports are. That is the main reason why ECMAScript 6 breaks with those systems: by building the module system into the language, you can syntactically enforce a static module structure. Let’s first examine what that means and then what benefits it brings.
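
A tiny sketch of the first point: the default export is just a named export whose name is default, so both import forms shown in the comments bind the same function. "./math" is a hypothetical module path.

```typescript
// math.ts — one default export and one named export.
export default function add(a: number, b: number): number {
  return a + b;
}
export const ZERO = 0;

// elsewhere — these two imports are equivalent ways to bind the default export:
//   import add from "./math";
//   import { default as add } from "./math";
```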

Making Sense of React Hooks.

Why mixins are broken

If several components used this mixin to subscribe to a data source, a nice way to avoid repetition is to use a pattern called “higher-order components”. It can sound intimidating so we will take a closer look at how this pattern naturally emerges from the component model.
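
A minimal sketch of that higher-order-component pattern: a function wraps a component and wires up the subscription a mixin would otherwise have provided. `DataSource` and `withSubscription` are illustrative names, not a real API.

```typescript
import React from "react";

// An illustrative data source: read the current value and subscribe to changes.
type DataSource<T> = {
  get(): T;
  subscribe(listener: () => void): () => void; // returns an unsubscribe function
};

// The HOC: instead of mixing subscription logic into each component, wrap it.
function withSubscription<T>(
  Wrapped: React.ComponentType<{ data: T }>,
  source: DataSource<T>
) {
  return class extends React.Component<object, { data: T }> {
    state = { data: source.get() };
    private unsubscribe?: () => void;

    componentDidMount() {
      this.unsubscribe = source.subscribe(() =>
        this.setState({ data: source.get() })
      );
    }
    componentWillUnmount() {
      this.unsubscribe?.();
    }
    render() {
      return React.createElement(Wrapped, { data: this.state.data });
    }
  };
}

export { withSubscription };
export type { DataSource };
```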

Mixins Are Dead. Long Live Composition.

render props, mixins, hocs...

5 common practices that you can stop doing in React.

How Does setState Know What to Do?. HN.

Why Do React Hooks Rely on Call Order?. HN.

decluttering a React application. reddit.

what is the shadow DOM?. reddit

Understanding JavaScript Modules As A TypeScript User.

Ask HN: Go-to web stack today?

Making SetInterval Declarative with React Hooks more

Spring Framework 5 (Boot/Cloud) + React?.

new es2018 features

hooks example

scheduling in react

useReducer

useEffect

Architecting UIs for Change

TypeScript for Enterprise Developers

react unpopular opinions

Dilemmas With React Hooks - Part 2: Persistence And Memoization. hn.

web components

biggest lies about react hooks

Comparing JVM alternatives to JavaScript. hn.

binding action creators - doesn't make much sense with hooks

Deeply Understanding JavaScript Async and Await with Examples

hooks tip

React articles from Google

Typescript & React: Manipulating Prop Types

RxJS: A Better Way to Write Front-end Applications. RxJS behaviour subjects

Typescript 3.5

react from vue

pitfalls adopting react hooks

unnecessary rerenders

React testings vs. end-to-end testing

You Probably Don't Need Derived State

The modern PWA worksheet

React and PureScript

reusable componentes using React

adopting typescript at scale

improving your react with typescript ADTs

React + Redux + Typescript

smells in react apps

Fantastic Front-End Performance Tricks

pseudo-elements https://css-tricks.com/a-little-reminder-that-pseudo-elements-are-children-kinda/

A Simple, Understandable Production Ready Frontend Project Setup

Facebook GraphQL interview

State Management for React Using Context and Hooks

algebraic effects

theming with react and sass

Programming the Cloud with TypeScript

People keep asking if Hooks can replace Redux

Using React Hooks to Wrap Connectors to Live Data Sources

Loading States in React Components Using TypeScript’s Discriminated Unions

using typescript like a pro

react interview

testing react with jest and enzyme

using Typescript with React

JavaScript: The Modern Parts

The metaphysics of Javascript. Deconstructing Web frameworks for a more resilient code base

react hooks pitfalls

Using React Hooks & Context to Avoid Adding Redux Too Early

frustrations with React hooks

Thinking in React Hooks

Chart.js

aha moment with react hooks

Building a commentary sidebar in React

build your own React

react unit testing

React Table is a “headless” UI library

One of the many problems with this data fetching approach is that the cache is too local

useState with useReducer

thinking in react hooks

newbie confusion. difficult things

TS tricks for React

My browser does what?

classes vs hooks

styled components

The Many Jobs of JS Build Tools

testing react applications

advanced PWAs

CSS options poll

Replacing Redux with observables and React Hooks

Things I wish I knew about state management when I started writing React apps

SSR menagerie

How to Scale a React Component

Create dynamic reducers by passing values or functions in as an argument

Persisting React State in LocalStorage

a possible approach to leveraging remoteData in React with hooks and TypeScript

es6 import for side effects meaning

Import an entire module for side effects only, without importing anything. This runs the module's global code, but doesn't actually import any values.

ES6 Module Gotchas

If you will have side-effects, separate them and load them in a module with short syntax.
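
A minimal sketch of both notes above: the side-effect-only import uses the short form and is kept in one obvious place. "./polyfills" and "./math" are hypothetical module paths.

```typescript
// Runs the module's top-level code (e.g. installing polyfills) without binding
// anything into this scope.
import "./polyfills";

// By contrast, a normal import binds values and should be free of surprise effects.
import { ZERO } from "./math";

console.log(ZERO);
```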

ES modules: A cartoon deep-dive

The final step is filling in these boxes in memory. The JS engine does this by executing the top-level code — the code that is outside of functions.

Besides just filling in these boxes in memory, evaluating the code can also trigger side effects. For example, a module might make a call to a server.

This is one reason to have the module map. The module map caches the module by canonical URL so that there is only one module record for each module. That ensures each module is only executed once. Just as with instantiation, this is done as a depth first post-order traversal.

testing React apps

useReducer > useState

typescript without typescript

manage html dom with vanilla javascript

react mental models

method references & bind. Can you bind 'this' in an arrow function?.

the magic of static workflows

The Many Jobs of JS Build Tools

source maps from top to bottom

when does react re-render?

things to know about react

How We Reduced Our React App’s Load Time by 60%

React-query and swr

  • Should I cache data on the client for a certain period?
  • Should I load fresh data when the tab is refocused, or the network reconnects?
  • Should I retry failed HTTP calls?
  • Should I return cached data, then fetch fresh data behind the scenes?
  • Should I handle server cache separately from app state?
  • Should I avoid refetching recently fetched data?
  • Should I prefetch data the user is likely to want?
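
Libraries like react-query answer most of these questions declaratively per query. A minimal sketch, assuming React Query v4's object-style API; `fetchTodos` and the `/api/todos` endpoint are made up:

```typescript
import { useQuery } from "@tanstack/react-query";

type Todo = { id: number; title: string };

async function fetchTodos(): Promise<Todo[]> {
  const res = await fetch("/api/todos");
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.json();
}

// Server-cache concerns live here, separate from app state.
export function useTodos() {
  return useQuery({
    queryKey: ["todos"],
    queryFn: fetchTodos,
    staleTime: 60_000,           // treat data as fresh for a minute (skip refetching)
    refetchOnWindowFocus: true,  // refresh when the tab is refocused
    refetchOnReconnect: true,    // ...and when the network reconnects
    retry: 3,                    // retry failed HTTP calls
  });
}
```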

Common mistakes writing React components with hooks .

A React implementation of Spectrum, Adobe’s design system

render props are not dead. Exploring Render Props Vs. React Hooks In 2020. Using custom hooks in place of "render props"

modern forms in react

cancelling promises with hooks https://reactjs.org/docs/hooks-state.html https://juliangaramendy.dev/use-promise-subscription/ https://medium.com/@rajeshnaroth/writing-a-react-hook-to-cancel-promises-when-a-component-unmounts-526efabf251f https://dev.to/rodw1995/cancel-your-promises-when-a-component-unmounts-gkl https://www.reddit.com/r/reactjs/comments/blhj2b/how_do_i_cancelignore_previously_running_promises/ https://itnext.io/introduction-to-abortable-async-functions-for-react-with-hooks-768bc72c0a2b https://codesandbox.io/s/useeffect-react-hooks-cancel-promise-h6dcw Lemoncode/react-hooks-by-example#4 https://stackoverflow.com/questions/49906437/how-to-cancel-a-fetch-on-componentwillunmount https://github.com/microsoft/PowerBI-JavaScript/wiki/Bootstrap-For-Better-Performance https://github.com/microsoft/PowerBI-client-react#powerbi-client-react

https://react-query.tanstack.com/

Shamir's secret sharing hn

Commitment scheme

Coin flipping by telephone a protocol for solving impossible problems slides

Zero-knowledge proof software tweet

Create a VPN-Secured VPC With Packer and Terraform.

How to run database integration tests 20 times faster.

In-memory databases such as H2, HSQLDB, and Derby are great to speed up integration tests. Although most database queries can be run against these in-memory databases, many enterprise systems make use of complex native queries which can only be tested against an actual production-like relational database.

Stuff about testing persistence repositories

https://softwareengineering.stackexchange.com/questions/301479/are-database-integration-tests-bad https://softwareengineering.stackexchange.com/questions/185326/why-do-i-need-unit-tests-for-testing-repository-methods https://softwareengineering.stackexchange.com/questions/348661/should-i-use-a-layer-between-service-and-repository-for-a-clean-architecture-s https://softwareengineering.stackexchange.com/questions/111193/hooking-up-a-business-layer-and-repository-using-unit-of-work-pattern?rq=1 https://softwareengineering.stackexchange.com/questions/294561/repository-pattern-with-service-layer-too-much-separation?rq=1 https://softwareengineering.stackexchange.com/questions/282033/how-do-you-scale-your-integration-testing/283067#283067 https://softwareengineering.stackexchange.com/a/283067/76774 fake repositories. https://www.baeldung.com/spring-boot-testing https://grokonez.com/testing/datajpatest-with-spring-boot

By default, @DataJpaTest will configure an in-memory embedded database, scan for @Entity classes and configure Spring Data JPA repositories. It is also transactional and rollback at the end of each test.

https://blog.philipphauer.de/dont-use-in-memory-databases-tests-h2/ https://www.baeldung.com/spring-testing-separate-data-source https://www.baeldung.com/spring-jpa-test-in-memory-database https://medium.com/@joeclever/integration-testing-multiple-datasources-in-spring-boot-and-spring-data-with-spock-f88e1428ce9f https://medium.com/@harittweets/how-to-connect-to-h2-database-during-development-testing-using-spring-boot-44bbb287570 https://vladmihalcea.com/how-to-run-database-integration-tests-20-times-faster/ https://memorynotfound.com/unit-test-jpa-junit-in-memory-h2-database/ https://stackoverflow.com/questions/42943447/spring-boot-integration-test-with-h2-inmemory-database

In-memory database tests with Querydsl

Fortunately the EntityQueries abstraction is very easy to implement using POJO in-memory collections.

Test Doubles — Fakes, Mocks and Stubs

Fakes are objects that have working implementations, but not the same as the production one. Usually they take some shortcut and have a simplified version of the production code.

An example of this shortcut, can be an in-memory implementation of Data Access Object or Repository. This fake implementation will not engage database, but will use a simple collection to store data. This allows us to do integration test of services without starting up a database and performing time consuming requests.
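
A minimal sketch of such a fake: a working but simplified repository backed by a Map instead of a database. `User`, `UserRepository` and `InMemoryUserRepository` are illustrative names.

```typescript
interface User {
  id: string;
  name: string;
}

interface UserRepository {
  save(user: User): Promise<void>;
  findById(id: string): Promise<User | undefined>;
}

// The fake: a real, working implementation that takes the in-memory shortcut,
// so services can be tested without starting a database.
class InMemoryUserRepository implements UserRepository {
  private readonly byId = new Map<string, User>();

  async save(user: User): Promise<void> {
    this.byId.set(user.id, { ...user });
  }
  async findById(id: string): Promise<User | undefined> {
    return this.byId.get(id);
  }
}

export { InMemoryUserRepository };
export type { User, UserRepository };
```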

Testing Real Repositories. When unit testing, do you have to use a database to test CRUD operations?. Data Access Component Testing Redux.

The most important thing to keep in mind is to avoid the temptation to create a General Fixture with some 'representative data' and attempt to reuse this across all tests. Instead, you should fill in data as part of each test and clean it up after.

https://news.ycombinator.com/item?id=18740246 Turning GraphQL diagrams to mock back end

farewell to fsync. lobsters.

verified fakes

From interaction-based to state-based testing. mocks for commands, stubs for queries. State vs Interaction Based Testing. An example of interaction-based testing in C#.

property-based testing. Building on developers' intuitions to create effective property-based tests.

don't write tests. Find the best properties for Property Based Testing. Introduction to Property Based Testing. Building on developers' intuitions

testing in production. Testing in Production, the safe way.

JUnit 5: The Next Step in Automated Testing.

about consumer-driven contracts

Nicolas Frankel on application security, integration testing, Kotlin and more

Hypothesis for web developers. tweet.

80% CODE COVERAGE IS NOT ENOUGH.

Integrated versus Manual Shrinking

[State machine testing - LambdaJam 2018](https://twitter.com/Jose_A_Alonso/status/1129325840662224902).

Testing Java Microservices: From Development to Production

nines

cloud native observability

test data generation with faker

docker for integration tests

most unit testing is a waste

storybook and react-testing

How to Specify it! A Guide to Writing Properties of Pure Functions

This is the most obvious approach to writing properties—to replicate the implementation in the test code—and it is deeply unsatisfying.

Real World Scenario Testing using Azure DevOps and automated UI tests

test telemetry. event log

state machine testing with Hedgehog. scala

Hedgehog vs Quickcheck

GOTO 2019 • Millisecond Full Stack Acceptance Tests • Aslak Hellesøy

runtime monitoring

Better Integration Tests for Performance Monitoring

Thoughts on efficient enterprise testing (1/6). 2.

tests that touch files

Testing in Production: the hard parts

testing sql

We have a ton of unit tests covering expected API/parser/renderer input/output pairs, but most functionality is covered in ~1000 integration tests running ~20k SQL queries against each supported RDBMS, mostly in-memory or Docker, sometimes VMWare run database instances.

Automation testing is not working disagree

Testing Microservices, the sane way

Conventional wisdom says you need a comprehensive set of regression tests to go green before you release code

Minimizing real-time prediction serving latency in machine learning https://cloud.google.com/solutions/machine-learning/minimizing-predictive-serving-latency-in-machine-learning

You have too many entities (high cardinality), which makes it challenging to precompute predictions in a limited amount of time. An example is forecasting daily sales by item when you have hundreds of thousands or millions of items. In that case, you can use a hybrid approach, where you precompute predictions for the top N entities, such as for the most active customers or the most viewed products. You can then use the model directly for online prediction for the rest of the long-tail entities.

https://liqixu.github.io/papers/needletail-hilda.pdf Optimally Leveraging Density and Locality for Exploratory Browsing and Sampling

We would need B+Trees on every single attribute or combination of attributes.

druid

At the time, we were handling approximately 100 millions events per day, and some of our reports were taking 30 seconds to generate. We currently handle billions of events per day, and the reporting takes less than 1 second most of the time.

https://hevodata.com/blog/druid-vs-redshift-data-warehouse/ https://towardsdatascience.com/introduction-to-druid-4bf285b92b5a

dynamo db

With NoSQL, it is best practice to precalculate aggregates values out of band, and store them back into the table as a single item for quick retrieval.

There are many data enrichment use cases that would fit this model.

Why you should use a relational database instead of NoSQL for your IoT application

Do we need pre-computed aggregates?

When querying time series data, resolution refers to the number of data points for a given time range. The highest resolution would provide every available data point for a time range. So if I want a query to use the highest resolution and if there are 100 data points, then the query result should include every one of those 100 points.

As the number of data points increases, providing results at higher resolutions becomes less effective. For instance, increasing the resolution to the point where a graph in the UI includes 1 million points is probably no more effective than if the graph included only 10,000 or even 1,000 data points. The higher resolution could degrade user experience as rendering time increases. Latency on server response time is also likely to increase.

Pre-computed aggregation is the process of continually downsampling a time series and storing the lower resolution data for future analysis or processing. Pre-computed aggregates are often combined with data expiration/retention policies to address the aforementioned storage problem. Higher resolution data is stored for shorter periods of time than lower resolution data. Pre-computed aggregation can also alleviate the CPU utilization and latency problems. Instead of downsampling 1 million data points, we can query the pre-computed aggregated data points and perform downsampling on 10,000 data points.
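
A minimal sketch of that downsampling step: bucket raw points into fixed windows and keep one aggregated value per window, which can then be stored as the pre-computed, lower-resolution series. The names and the choice of an average aggregate are illustrative.

```typescript
type Sample = { t: number; v: number }; // t = timestamp in milliseconds

// Reduce a high-resolution series to one averaged point per window.
function downsampleAvg(samples: Sample[], windowMs: number): Sample[] {
  const buckets = new Map<number, { sum: number; n: number }>();
  for (const { t, v } of samples) {
    const key = Math.floor(t / windowMs) * windowMs;  // start of the window
    const b = buckets.get(key) ?? { sum: 0, n: 0 };
    b.sum += v;
    b.n += 1;
    buckets.set(key, b);
  }
  return [...buckets.entries()]
    .sort(([a], [b]) => a - b)
    .map(([t, { sum, n }]) => ({ t, v: sum / n }));
}

export { downsampleAvg };
```

Querying a long range then means reading the stored low-resolution points rather than re-aggregating millions of raw samples on every request.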

time series databases to watch

I have used TimescaleDB for several purposes. As it is built on top of Postgres, all the existing tools, libraries and processes work out of the box. This is a huge advantage if you are operating Postgres anyway: your existing backup tools will work, as does your user management.

We ingest a lot of time series (IoT) data and use Postgres for other data, so Timescale works quite well for us. One thing Timescale treats as a second-class citizen, though, is updates to existing data points, which are 1-2 orders of magnitude slower than inserts. Granted, this is also the case for all other TSDB solutions out there, which are for obvious reasons optimized for inserts and for reads aggregated along the time dimension. Still, it would be amazing if you could add to the already existing differentiation by allowing fast updates for cases like ours, where we don't store events relating to a singular point in time but rather time-spans, so new incoming data points might be "merged" into existing time-spans.

Narrow-table Model

In this model, each metric/tag-set combination is considered an individual "time series" containing a sequence of time/value pairs.

Using our example above, this approach would result in 9 different "time series", each of which is defined by a unique set of tags.

The number of such time series scales with the cross-product of the cardinality of each tag, i.e., (# names) × (# device ids) × (# location ids) × (device types). Some time-series databases struggle as cardinality increases, ultimately limiting the number of device types and devices you can store in a single database.

TimescaleDB supports narrow models and does not suffer from the same cardinality limitations as other time-series databases do. A narrow model makes sense if you collect each metric independently. It allows you to add new metrics as you go by adding a new tag without requiring a formal schema change.

TimescaleDB easily supports wide-table models. Queries across multiple metrics are easier in this model, since they do not require JOINs. Also, ingest is faster since only one timestamp is written for multiple metrics.

Of course, this is not a new format: it's what one would commonly find within a relational database.
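
A small sketch of the two row shapes described above, written as TypeScript types purely for illustration (the metric and tag names are made up):

```typescript
// Narrow model: one row per metric / tag-set / timestamp, a single value per row.
type NarrowRow = {
  time: Date;
  metric: string;                 // e.g. "temperature"
  tags: Record<string, string>;   // e.g. { deviceId: "d1", location: "berlin" }
  value: number;
};

// Wide model: one row per timestamp / tag-set, one column per metric
// (what a conventional relational table usually looks like).
type WideRow = {
  time: Date;
  deviceId: string;
  location: string;
  temperature: number;
  humidity: number;
  cpuLoad: number;
};

export type { NarrowRow, WideRow };
```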

Relational Database schema design for metric storage good SO answer

Schema Design for Time Series Data

For time series, you should generally use tall and narrow tables. This is for two reasons: Storing one event per row makes it easier to run queries against your data.

Timeseries: How long can the elephant remember?

Wide narrow data

BigQuery Best Practices For High Performance ETL

Continuous Queries in InfluxDB – Part I. more. Under the hood with Continuous Queries – Part II. downsampling and retention. Resolution 1: Downsample to get your database in shape.

Queries returning aggregate, summary, and computed data are frequently used in application development. For example, if you’re building an online dashboard application to report metrics, you probably need to show summary data. These summary queries are generally expensive to compute since they have to process large amounts of data, and running them over and over again just wouldn’t scale. Now, if you could pre-compute and store the aggregates query results so that they are ready when you need them, it would significantly speed up summary queries in your dashboard application, without overloading your database. Enter InfluxDB’s continuous queries feature!

Series cardinality is the number of unique database, measurement, and tag set combinations in an InfluxDB instance. We talk about it quite a bit because extremely high series cardinality can kill your InfluxDB process.

Using Data Transformations for Low-latency Time Series Analysis

While a row can have an arbitrary number of fields, we encourage users to define their table schema as narrow tables, such as the OpenTSDB [8] table format: {metric, tags, time, value}.

Is the EAV model still a decent way to store misc model data?

JSON columns. The performance issues were fixed, so there's no reason for EAV anymore.

Re-architecting Slack’s Workspace Preferences: How to Move to an EAV Model to Support Scalability

Failed Solution II: Pre-compute the World in NoSQL

In short, we took all of our data and pre-computed aggregates for every combination of dimensions. At query time we need only locate the specific pre-computed aggregate and return it: an O(1) key-value lookup. This made things fast and worked wonderfully when we had a six-dimension beta data set. But when we added five more dimensions – giving us 11 dimensions total – the time to pre-compute all aggregates became unmanageably large (such that we never waited the more than 24 hours required to see it finish).

So we decided to limit the depth that we aggregated to. By only pre-computing aggregates of five dimensions or less, we were able to limit some of the exponential expansion of the data. The data became manageable again, meaning it only took about 4 hours on 15 machines to compute the expansion of a 500k beta rows into the full multi-billion entry output data set.

What distinguishes the time series workload?

With time series databases, it’s common to keep high precision data around for a short period of time. This data is aggregated and downsampled into longer term trend data. This means that for every data point that goes into the database, it will have to be deleted after its period of time is up. This kind of data lifecycle management is difficult for application developers to implement on top of regular databases. They must devise schemes for cheaply evicting large sets of data and constantly summarizing that data at scale. With a Time Series Database, this functionality is provided out of the box.

How To Resample and Interpolate Your Time Series Data With Python. more

Downsampling: Where you decrease the frequency of the samples, such as from days to months.

Downsampling reduces the number of samples in the data. During this reduction, we are able to apply aggregations over data points.

on time series

Very often TS data is used to generate charts. This is an artifact of the human brain being spectacularly good at interpreting a visual representation of a relationship between streams of numbers while nearly incapable of making sense of data in tabular form. When plotting, no matter how much data is being examined, the end result is limited to however many pixels are available on the display. Even plotting aside, most any use of time series data is in an aggregated form.

Implementing Multidimensional Data Warehouses into NoSQL

mondrian aggregate tables

Time Series Aggregate Store. Read Time Series Data for Multiple Property Types of a Thing Applying M4 Algorithm

The Time Series Aggregate Store stores the pre-calculated aggregates for the time series data of Things

With this service, you can read the time series data for multiple property types of the specified thing by applying the M4 algorithm.

M4: A Visualization-Oriented Time Series Data Aggregation

M4 Aggregation. M4 is a composite value-preserving aggregation (see Section 4.1) that groups a time series relation into w equidistant time spans, such that each group exactly corresponds to a pixel column in the visualization. For each group, M4 then computes the aggregates min(v), max(v), min(t), and max(t) – hence the name M4
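
A rough pandas sketch of that grouping step, not the paper's reference implementation: assuming a DataFrame with a numeric time column t and a value column v, each of the w pixel-wide spans keeps the rows holding min(t), max(t), min(v) and max(v).

```python
# Keep, per pixel-column group, the rows carrying the four M4 aggregates.
import pandas as pd

def m4(df: pd.DataFrame, t0: float, t1: float, w: int) -> pd.DataFrame:
    groups = ((df["t"] - t0) * w / (t1 - t0)).astype(int).clip(0, w - 1)
    keep = set()
    for _, g in df.groupby(groups):
        keep.update([g["t"].idxmin(), g["t"].idxmax(),
                     g["v"].idxmin(), g["v"].idxmax()])
    return df.loc[sorted(keep)]
```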

A Review of Aggregation Algorithms for the Internet of Things. Data aggregation mechanisms in the Internet of things. Efficiently Validating Aggregated IoT Data Integrity. Comparison of Data Aggregation Techniques in Internet of Things (IoT)

FAQs for Big Data & Analytics on Timeseries within SAP IoT Application Enablement (Leonardo Foundation)

Storing time-series data, relational or non?. Is there a powerful database system for time series data? [closed]. storing massive ordered time series data in bigtable derivatives. Database and large Timeseries - Downsampling - OpenTSDB InfluxDB Google DataFlow

typical evaluations are hard to formulate in SQL and slow in execution. E.g. find the maximum value with timestamp per 15 minutes for all measurements during the last month.
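
The 15-minute "maximum value with its timestamp" query above, sketched with pandas instead of SQL; the DatetimeIndex and the 'value' column name are assumptions for illustration.

```python
# For each 15-minute bucket, return the row holding the maximum value.
import pandas as pd

def max_with_timestamp(df: pd.DataFrame) -> pd.DataFrame:
    """df has a DatetimeIndex and a 'value' column."""
    idx_of_max = (df["value"]
                  .resample("15min")
                  .agg(lambda s: s.idxmax() if len(s) else pd.NaT))
    return df.loc[idx_of_max.dropna()]
```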

The need arises because we can query a vast amount of data, for instance a year; if the DB downsamples at query time rather than reading pre-computed results, it may take a very long time.

As well, downsampling needs to be "updated" whenever "delayed" data points are added.

For time series databases, period aggregation (aka downsampling, averaging, summarization, etc.) is one of the standard use cases. They are all pretty good at it, at least the basics - avg, min, max, percentiles, first/last, etc. opentsdb for instance reads raw data and returns aggregates and then queues these aggregates for re-use. This is how it worked last time I checked.

Time-series data: Why (and how) to use a relational database instead of NoSQL

time aggregation - DB2

Time aggregation is the aggregation of all data points for a single resource over a specified period (the granularity). Data aggregations in Resource Time Series reports are of the time aggregation type.

The result of the aggregation is one data point that reflects a statistical view of the collected and aggregated data points. For example, average, minimum, maximum, sum, or count. Typically, multiple aggregated data points are presented in a report for a given reporting period.

Benchmarking Time Series Databases with IoTDB-Benchmark for IoT Scenarios

What time series database can support high cardinality?

Procella: unifying serving and analytical data at YouTube

"For real-time tables, the user can also specify how to age-out, down-sample or compact the data"

Influxdb use case

Datadog

Why You Should NOT be Using an RDBMS for Time-Stamped Data

Redis and Grafana for real-time analytics

GOTO 2019 • Temporal Modelling • Mathias Verraes

high cardinality data and stuff

Influxdb on Kubernetes - should I be doing this?

Apache Druid vs. Time-Series Databases

from batch to streaming to both

How to Get Started Using CrateDB and Grafana to Visualize Time-Series Data

things we learned about sums

MetricsDB: TimeSeries Database for storing metrics at Twitter

clickhouse

ClickHouse: New Open Source Columnar Database

clickhouse

interview about clickhouse

Raspberry Pi IoT: Sensors, InfluxDB, MQTT, and Grafana

How Netflix uses Druid

Handling Real-Time Updates in ClickHouse

Mutable data is generally unwelcome in OLAP databases.

Under the pressure of GDPR requirements, the ClickHouse team delivered UPDATEs and DELETEs in 2018.

ClickHouse as an alternative to Elasticsearch for log storage and analysis

grafana datasources

grafana datasources

Timescaledb

Zabbix, Time Series Data and TimescaleDB. hn.

By clever sharding, you can work around the performance issues somewhat but it'll never be as efficient as an OLAP column store like ClickHouse or MemSQL:

Timestamps and metric values compress very nicely using delta-of-delta encoding.

Compression dramatically improves scan performance.

Aligning data by columns means much faster aggregation. A typical time series query does min/max/avg aggregations by timestamp. You can load data straight from disk into memory, use SSE/AVX instructions and only the small subset of data you aggregate on will have to be read from disk.
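
A toy illustration in Python of the delta-of-delta point above, not any particular database's encoder: near-regular timestamps collapse into long runs of zeros, which is what makes the subsequent compression so effective.

```python
# Delta-of-delta encoding of timestamps (Gorilla-style TSDB compression idea).
def delta_of_delta(timestamps):
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return [timestamps[0], deltas[0] if deltas else 0] + \
           [b - a for a, b in zip(deltas, deltas[1:])]

print(delta_of_delta([1000, 1010, 1020, 1030, 1041]))
# [1000, 10, 0, 0, 1]  -> mostly zeros for near-regular sampling
```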

PG Partition Manager

timescale multi cloud

Building a distributed time-series database on PostgreSQL

TimescaleDB adds native compression for any PostgreSQL type

Reading this post is frustrating. What they are describing is where column store databases were 20 years ago. Perhaps at some point the folks at TimescaleDB will read Daniel Abadi’s 2008 paper, which describes the key elements of how all modern column stores work: http://db.csail.mit.edu/pubs/abadi-column-stores.pdf

The key takeaway is that columnar compression only accounts for a small minority of the speed up that you get for scan-oriented workloads; the real big win comes when you implement a block-oriented query processor and pipelined execution. Of course you can’t do this by building inside the Postgres codebase, which is why every good column store is built more or less from scratch.

Anyone considering a “time series database” should first set up a modern commercial column store, partition their tables on the time column, and time their workload. For any scan-oriented workload, it will crush a row store like Timescale.

Multi-node TimescaleDB is now free

ListenBrainz moves to TimescaleDB

TimescaleDB vs. Amazon Timestream

graphite

whisper

Whisper is a fixed-size database, similar in design and purpose to RRD (round-robin-database). It provides fast, reliable storage of numeric data over time. Whisper allows for higher resolution (seconds per point) of recent data to degrade into lower resolutions for long-term retention of historical data.

opentsdb

opentsdb downsampling. Rollup And Pre-Aggregates. aggregators. Understanding Metrics and Time Series. Rollup And Pre-Aggregates IMPORTANT.

Downsampling (or in signal processing, decimation) is the process of reducing the sampling rate, or resolution, of data. For example, let's say a temperature sensor is sending data to an OpenTSDB system every second. If a user queries for data over an hour time span, they would receive 3,600 data points, something that could be graphed fairly easily. However now if the user asks for a full week of data they'll receive 604,800 data points and suddenly the graph may become pretty messy. Using a downsampler, multiple data points within a time range for a single time series are aggregated together with a mathematical function into a single value at an aligned timestamp. This way we can reduce the number of values from say, 604,800 to 168.

When storing rollups, it's best to avoid functions such as average, median or deviation. When performing further downsampling or grouping aggregations, such values become meaningless. Instead it's much better to always store the sum and count from which, at least, the average can be computed at query time. For more information, see the section below.
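
A tiny Python example of why that advice holds; the bucket values are made up. Averaging pre-computed averages weights every bucket equally, regardless of how many raw points it contained, while sum and count let the correct average be recovered at query time.

```python
buckets = [
    {"sum": 10.0, "count": 1},     # one sample, value 10
    {"sum": 300.0, "count": 100},  # a hundred samples around 3
]

avg_of_avgs = sum(b["sum"] / b["count"] for b in buckets) / len(buckets)
true_avg = sum(b["sum"] for b in buckets) / sum(b["count"] for b in buckets)

print(avg_of_avgs)  # 6.5   -- wrong
print(true_avg)     # ~3.07 -- correct, because sum and count were kept
```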

OpenTSDB was designed to efficiently combine multiple, distinct time series during query execution. The reason for this is that when users are looking at their data, most often they start at a high level asking questions like "what is my total throughput by data center?" or "what is the current power consumption by region?". After looking at these high level values, one or more may stick out so users drill-down into more granular data sets like "what is the throughput by host in my LAX data center?". We want to make it easy to answer those high level questions but still allow for drilling down for greater detail.

But how do you merge multiple individual time series into a single series of data? Aggregation functions provide the means of mathematically merging the different time series into one. Filters are used to group results by tags and aggregations are then applied to each group. Aggregations are similar to SQL's GROUP BY clause where the user selects a pre-defined aggregation function to merge multiple records into a single result. However in TSDs, a set of records is aggregated per timestamp and group.

This document focuses on how aggregators are used in a group by context, i.e. when merging multiple time series into one. Additionally, aggregators can be used to downsample time series (i.e. return a lower resolution set of results). For more information, see Downsampling.

While rollups help with wide time span queries, you can still run into query performance issues with small ranges if the metric has high cardinality (i.e. the unique number of time series for the given metric). In the example above, we have 4 web servers. But let's say that we have 10,000 servers. Fetching the sum or average of interface traffic may be fairly slow. If users are often fetching the group by (or some think of it as the spatial aggregate) of large sets like this then it makes sense to store the aggregate and query that instead, fetching much less data.

Notice that these time series have dropped the tags for host and interface. That's because, during aggregation, multiple, different values of the host and interface have been wrapped up into this new series so it no longer makes sense to have them as tags. Also note that we injected the new _aggregate tag in the stored data. Queries can now access this data by specifying an _aggregate value.

While pre-aggregates certainly help with high-cardinality metrics, users may still want to ask for wide time spans but run into slow queries. Thankfully you can roll up a pre-aggregate in the same way as raw data. Just generate the pre-aggregate, then roll it up using the information above.

One method that is commonly used by other time series databases is to read the data out of the database after some delay, calculate the pre-aggs and rollups, then write them. This is the easiest way of solving the problem and works well at small scales. However, there are still a number of issues.

How to Handle the Influx of Data

timescale cloud

bitemporal data

https://martinfowler.com/bliki/DataLake.html

It is important that all data put in the lake should have a clear provenance in place and time. Every data item should have a clear trace to what system it came from and when the data was produced. The data lake thus contains a historical record. This might come from feeding Domain Events into the lake, a natural fit with Event Sourced systems. But it could also come from systems doing a regular dump of current state into the lake - an approach that's valuable when the source system doesn't have any temporal capabilities but you want a temporal analysis of its data. A consequence of this is that data put into the lake is immutable, an observation once stated cannot be removed (although it may be refuted later), you should also expect ContradictoryObservations.

https://martinfowler.com/bliki/ContradictoryObservations.html https://blog.bi-geek.com/arquitectura-bi-introduccion-al-data-lake/ https://tdwi.org/articles/2017/12/04/arch-all-data-time-and-the-data-lake.aspx

Business state data tagged with business and DBMS start and end timestamps is called bitemporal data. This structure is one of the most useful ways that relational database designers record and manage time-related data. Several databases, including IBM DB2 and Teradata, have included internal support for bitemporal data since early this decade.

Bitemporality is at the heart of data warehouse consistency and enables operational systems to manage the creation of state data from time series. As relational databases are the basis of both operational systems and data warehouses, extensive design and development effort has been expended over the decades to handle time properly in this environment.

“The lake's one-dimensional time series approach can give rise to significant implementation challenges in more complex data warehouse use cases.” https://tdwi.org/articles/2017/12/04/arch-all-data-time-and-the-data-lake.aspx

https://www.elsevier.com/books/bitemporal-data/johnston/978-0-12-408067-6 https://www.dataversity.net/bitemporal-data-modeling-learn-history/ https://martinfowler.com/eaaDev/timeNarrative.html https://en.wikipedia.org/wiki/Temporal_database

Bi-Temporal

A bi-temporal database has two axes of time.

valid time, and transaction time (or decision time).

https://www.marklogic.com/blog/bitemporal/ https://www.sciencedirect.com/topics/computer-science/bitemporal-data https://www.sciencedirect.com/science/article/pii/B9780123750419000029 https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.5.0/com.ibm.db2.luw.admin.dbobj.doc/doc/c0058476.html

A bitemporal table is a table that combines the historical tracking of a system-period temporal table with the time-specific data storage capabilities of an application-period temporal table. Use bitemporal tables to keep user-based period information as well as system-based historical information.

Bitemporal tables behave as a combination of system-period temporal tables and application-period temporal tables. All the restrictions that apply to system-period temporal tables and application temporal tables also apply to bitemporal tables.
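
A minimal, illustrative sketch of the two time axes in plain Python; the field names are invented, and real systems such as DB2 express this with period columns and SQL syntax rather than application code.

```python
# Each row carries a valid-time interval (when the fact was true in the
# business domain) and a transaction-time interval (when the database
# believed it). Intervals are half-open; date.max means "still current".
from dataclasses import dataclass
from datetime import date

@dataclass
class Row:
    customer: str
    address: str
    valid_from: date
    valid_to: date
    tx_from: date
    tx_to: date

def as_of(rows, valid_at: date, known_at: date):
    """What did we believe at `known_at` about the state at `valid_at`?"""
    return [r for r in rows
            if r.valid_from <= valid_at < r.valid_to
            and r.tx_from <= known_at < r.tx_to]
```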

https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.5.0/com.ibm.db2.luw.admin.dbobj.doc/doc/c0058481.html https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.5.0/com.ibm.db2.luw.admin.dbobj.doc/doc/r0052344.html

http://manishearth.github.io/blog/2017/01/14/stop-ascribing-meaning-to-unicode-code-points/ Let’s Stop Ascribing Meaning to Code Points.

Breaking Our Latin-1 Assumptions http://manishearth.github.io/blog/2017/01/15/breaking-our-latin-1-assumptions/

UAX #29: Unicode Text Segmentation https://unicode.org/reports/tr29/

https://github.com/unicode-rs/unicode-segmentation

Iterators which split strings on Grapheme Cluster or Word boundaries, according to the Unicode Standard Annex #29 rules.

http://site.icu-project.org/home

ICU is a mature, widely used set of C/C++ and Java libraries providing Unicode and Globalization support for software applications. ICU is widely portable and gives applications the same results on all platforms and between C/C++ and Java software.

http://userguide.icu-project.org/

http://userguide.icu-project.org/boundaryanalysis

http://userguide.icu-project.org/boundaryanalysis#TOC-Character-Boundary

https://stackoverflow.com/questions/40878804/how-to-count-grapheme-clusters-or-perceived-emoji-characters-in-java

https://engineering.linecorp.com/en/blog/the-7-ways-of-counting-characters/

https://softwareengineering.stackexchange.com/questions/13207/string-class-based-on-graphemes

https://news.ycombinator.com/item?id=13832831 Emoji.length == 2

This is most definitely not a solved problem, because graphemes (visual symbols) are a poor way to deal with unicode in the real world. Pretty much all systems either deal with the length in bytes (if they're old-style C), in code units / byte pairs (if they're UTF-16 based, like windows, java and javascript), or in unicode code points (if they're UTF-8 based, like every proper system should be). Dealing with the length in visual symbols is actually pretty much impossible in practice because databases won't let you define field lengths in graphemes.

The way things compose: bytes combine into code points (unicode numbers), and code points combine into graphemes (visual symbols). In UTF-16 for legacy compatibility reasons with UCS-2, code points decompose into code units (byte pairs), and high code points, which need a lot of bits to represent their number, need two code units (4 bytes) instead of one.

Java and JavaScript are UTF-16 based, so they measure length in code units and not code points. An emoji code point can be a low or high number depending on when it was added. Low numbers can be stored in two bytes, high numbers need four bytes. So an emoji can have length 1 or 2 in UTF-16. However, when moving to the database it will typically be stored in UTF-8, and the field length will be code points, not code units. So, that emoji will have a length of 1 regardless of whether it is low or high. You don't notice this as a problem because app-level field length checks will return a bigger number than what the database perceives, so no field length limits are exceeded.

There isn't any such thing as "characters" in code. In documentation when they say "characters" usually they mean bytes, code units or code points. Almost never do they mean graphemes, which is intuitively what people think they mean. The bottom line is two-fold: (A) always understand what is meant in documentation by "length in characters", because it almost never means the intuitive thing, and (B) don't try to use graphemes as your unit of length, it won't work in practice.
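
The different "lengths" discussed above are easy to see from Python, where len() counts code points and the code-unit and byte counts fall out of encoding; grapheme counting is only hinted at here because it needs an external library.

```python
s = "e\u0301"                               # 'é' as base letter + combining accent
print(len(s))                                # 2 code points
print(len(s.encode("utf-8")))                # 3 bytes
print(len(s.encode("utf-16-le")) // 2)       # 2 UTF-16 code units

emoji = "\U0001F4A9"                         # a code point outside the BMP
print(len(emoji))                            # 1 code point
print(len(emoji.encode("utf-16-le")) // 2)   # 2 code units (a surrogate pair)
```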

https://wiki.sei.cmu.edu/confluence/display/java/STR01-J.+Do+not+assume+that+a+Java+char+fully+represents+a+Unicode+code+point

The char data type is based on the original Unicode specification, which defined characters as fixed-width 16-bit entities. The Unicode Standard has since been changed to allow for characters whose representation requires more than 16 bits. The range of Unicode code points is now U+0000 to U+10FFFF. The set of characters from U+0000 to U+FFFF is called the basic multilingual plane (BMP), and characters whose code points are greater than U+FFFF are called supplementary characters. Such characters are generally rare, but some are used, for example, as part of Chinese and Japanese personal names. To support supplementary characters without changing the char primitive data type and causing incompatibility with previous Java programs, supplementary characters are defined by a pair of Unicode code units called surrogates. According to the Java API [API 2014] class Character documentation (Unicode Character Representations):

The Java platform uses the UTF-16 representation in char arrays and in the String and StringBuffer classes. In this representation, supplementary characters are represented as a pair of char values, the first from the high-surrogates range, (\uD800-\uDBFF), the second from the low-surrogates range (\uDC00-\uDFFF).

A char value, therefore, represents BMP code points, including the surrogate code points, or code units of the UTF-16 encoding. An int value represents all Unicode code points, including supplementary code points. The lower (least significant) 21 bits of int are used to represent Unicode code points, and the upper (most significant) 11 bits must be zero. Similar to UTF-8 (see STR00-J. Don't form strings containing partial characters from variable-width encodings), UTF-16 is a variable-width encoding. Because the UTF-16 representation is also used in char arrays and in the String and StringBuffer classes, care must be taken when manipulating string data in Java. In particular, do not write code that assumes that a value of the primitive type char (or a Character object) fully represents a Unicode code point. Conformance with this requirement typically requires using methods that accept a Unicode code point as an int value and avoiding methods that accept a Unicode code unit as a char value because these latter methods cannot support supplementary characters.

https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/maint/maint-67/icu4c/readme.html

https://unicode-org.github.io/icu-docs/#/icu4c/

https://bollu.github.io/mathemagic/declarative/index.html https://news.ycombinator.com/item?id=23231361

https://apps.timwhitlock.info/unicode/inspect?s=e%CC%81 unicode inspector

é <- this one is two codepoints

é <- this one isn’t

I thought it would be useful to share my personal list of scripts that break our Latin-1 assumptions. This is a list I mentally check against whenever I am attempting to reason about text. I check if I’m making any assumptions that break in these scripts. Most of these concepts are independent of Unicode; so any program would have to deal with this regardless of encoding.

I again recommend going through eevee’s post, since it covers many related issues. Awesome-Unicode also has a lot of random tidbits about Unicode.

https://en.wikipedia.org/wiki/Arabic_alphabet#Table_of_basic_letters

https://www.gutenberg.org/catalog/

https://en.wikipedia.org/wiki/Devanagari_(Unicode_block)

https://stackoverflow.com/questions/6805311/combining-devanagari-characters

http://unicode.org/faq/char_combmark.html

http://unicode.org/faq/

http://www.unicode.org/versions/Unicode6.0.0/ch04.pdf

https://en.wikipedia.org/wiki/Unicode_block

  1. Grapheme clusters: how many of what end users might consider "characters". In this example, the Devanagari syllable "ni" must be composed using a base character "na" (न) followed by a combining vowel for the "i" sound ( ि), although end users see and think of the combination of the two "नि" as a single unit of text. In this sense, the example string can be thought of as containing 4 “characters” as end users see them. A default grapheme cluster is specified in UAX #29, Unicode Text Segmentation, as well as in UTS #18, Unicode Regular Expressions.

The choice of which count to use and when depends on the use of the value, as well as the tradeoffs between efficiency and comprehension. For example, Java, Windows, and ICU use UTF-16 code unit counts for low-level string operations, but also supply higher level APIs for counting bytes, characters, or denoting boundaries between grapheme clusters, when circumstances require them. An application might use these to, say, limit user input based on a number of "screen positions" using the user-perceived "character" (grapheme cluster) count. Or the application might have an internal limit based on storage allocation in a database field counted in bytes. This approach allows for efficient low-level processing, with allowance for higher-level usage. However, for a very high-level application, such as word-processing macros, grapheme clusters alone may be sufficient.
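
For counting grapheme clusters from Python, the standard library offers nothing, but the third-party `regex` module implements the \X extended-grapheme-cluster match from UAX #29; a small sketch assuming that module is installed:

```python
import regex  # pip install regex

def grapheme_count(s: str) -> int:
    """Count extended grapheme clusters (user-perceived characters)."""
    return len(regex.findall(r"\X", s))

print(grapheme_count("नि"))       # 1 perceived character, built from 2 code points
print(grapheme_count("e\u0301"))  # 1, even though len() reports 2
```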

https://emojipedia.org/

https://blog.emojipedia.org/what-the-2021-unicode-delay-means-for-emoji/

https://emojipedia.org/emoji-zwj-sequence/

https://emojipedia.org/eye-in-speech-bubble/

https://emojipedia.org/emoji-flag-sequence/

https://blog.emojipedia.org/emoji-zwj-sequences-three-letters-many-possibilities/

https://en.wikipedia.org/wiki/Unicode_block

A Unicode block is one of several contiguous ranges of numeric character codes (code points) of the Unicode character set that are defined by the Unicode Consortium for administrative and documentation purposes. Typically, proposals such as the addition of new glyphs are discussed and evaluated by considering the relevant block or blocks as a whole.

Each block is generally, but not always, meant to include all the glyphs used by one or more specific languages, or in some general application area such as mathematics, surveying, decorative typesetting, social forums, etc.

Unicode blocks are identified by unique names, which use only ASCII characters and are usually descriptive of the nature of the symbols, in English; such as "Tibetan" or "Supplemental Arrows-A". (When comparing block names, one is supposed to equate uppercase with lowercase letters, and ignore any whitespace, hyphens, and underbars; so the last name is equivalent to "supplemental_arrows__a" and "SUPPLEMENTALARROWSA".)[1]

Blocks are pairwise disjoint, that is, they do not overlap. The starting code point and the size (number of code points) of each block are always multiples of 16; therefore, in the hexadecimal notation, the starting (smallest) point is U+xxx0 and the ending (largest) point is U+yyyF, where xxx and yyy are three or more hexadecimal digits. (These constraints are intended to simplify the display of glyphs in Unicode Consortium documents, as tables with 16 columns labeled with the last hexadecimal digit of the code point.[1]) The size of a block may range from the minimum of 16 to a maximum of 65,536 code points.

Every assigned code point has a glyph property called "Block", whose value is a character string naming the unique block that owns that point.[2] However, a block may also contain unassigned code points, usually reserved for future additions of characters that "logically" should belong to that block. Code points not belonging to any of the named blocks, e.g. in the unassigned planes 3–13, have the value block="No_block".[1]

https://unicode.org/emoji/charts/full-emoji-list.html unicode emojis, grapheme clusters

https://stackoverflow.com/questions/40878804/how-to-count-grapheme-clusters-or-perceived-emoji-characters-in-java

https://users.rust-lang.org/t/how-to-iterate-over-emojis-grapheme-clusters/14254

https://users.rust-lang.org/t/how-to-iterate-over-emojis-grapheme-clusters/14254/4

https://hsivonen.fi/string-length/ <- this is awesome!!!!!

https://blog.jonnew.com/posts/poo-dot-length-equals-two but, is that true?

https://stackoverflow.com/questions/54369513/how-to-count-the-correct-length-of-a-string-with-emojis-in-javascript

For example, the character encoding scheme ASCII comprises 128 code points in the range 0x00 to 0x7F, Extended ASCII comprises 256 code points in the range 0x00 to 0xFF, and Unicode comprises 1,114,112 code points in the range 0x000000 to 0x10FFFF. The Unicode code space is divided into seventeen planes (the basic multilingual plane, and 16 supplementary planes), each with 65,536 (= 2^16) code points. Thus the total size of the Unicode code space is 17 × 65,536 = 1,114,112.

Why is Java’s primitive “char” designed to correspond to 1 code unit of UTF-16 instead of 1 grapheme or 1 code point? Because when Java was first designed, Unicode’s entire code point range was defined in 16 bits.

The concept of “encoding every character in 16 bits” was something that the original designers of Unicode were proud enough to include in their design principles. (Not long after Java was announced, Unicode was expanded beyond 16 bits. As of Unicode 7.0, it is defined up to U+10FFFF, or 17*65536=1,114,112 code points.) Meanwhile, MySQL's and Oracle’s “utf8” charset is closer to CESU-8 than it is to UTF-8, possibly requiring more space. When encoding in UTF-8, the charsets “AL32UTF8” (Oracle) or “utf8mb4” (MySQL) must be used. Swift, one of the most recent programming languages, is defined so that a character type is expressed as 1 grapheme.

https://en.wikipedia.org/wiki/Plane_(Unicode)

Planes are further subdivided into Unicode blocks, which, unlike planes, do not have a fixed size. The 308 blocks defined in Unicode 13.0 cover 26% of the possible code point space, and range in size from a minimum of 16 code points (fifteen blocks) to a maximum of 65,536 code points (Supplementary Private Use Area-A and -B, which constitute the entirety of planes 15 and 16). For future usage, ranges of characters have been tentatively mapped out for most known current and ancient writing systems.[4]

https://www.w3.org/International/articles/definitions-characters/

UTF-32

From Wikipedia, the free encyclopedia


UTF-32 (32-bit Unicode Transformation Format) is a fixed-length encoding used to encode Unicode code points that uses exactly 32 bits (four bytes) per code point (but a number of leading bits must be zero as there are far fewer than 2^32 Unicode code points). UTF-32 is a fixed-length encoding, in contrast to all other Unicode transformation formats, which are variable-length encodings. Each 32-bit value in UTF-32 represents one Unicode code point and is exactly equal to that code point's numerical value.

The main advantage of UTF-32 is that the Unicode code points are directly indexed. Finding the Nth code point in a sequence of code points is a constant time operation. In contrast, a variable-length code requires sequential access to find the Nth code point in a sequence. This makes UTF-32 a simple replacement in code that uses integers that are incremented by one to examine each location in a string, as was commonly done for ASCII.

The main disadvantage of UTF-32 is that it is space-inefficient, using four bytes per code point, including 11 bits that are always zero. Characters beyond the BMP are relatively rare in most texts, and can typically be ignored for sizing estimates. This makes UTF-32 close to twice the size of UTF-16. It can be up to four times the size of UTF-8 depending on how many of the characters are in the ASCII subset.

Though a fixed number of bytes per code point seems convenient, it is not as useful as it appears. It makes truncation easier but not significantly so compared to UTF-8 and UTF-16 (both of which can search backwards for the point to truncate by looking at 2–4 code units at most).

It is extremely rare that code wishes to find the Nth code point without earlier examining the code points 0 to N–1. For instance, XML parsing cannot do anything with a character without first looking at all preceding characters.[4] So an integer index that is incremented by 1 for each character can be replaced with an integer offset, measured in code units and incremented by the number of code units as each character is examined. This removes the perceived speed advantages of UTF-32.
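
A quick size comparison of the three encodings described above for a mostly-ASCII string, using Python and the BOM-less encodings so the byte counts reflect only the code units:

```python
s = "hello, naïve café"   # 17 code points, two of them outside ASCII
for enc in ("utf-8", "utf-16-le", "utf-32-le"):
    print(enc, len(s.encode(enc)), "bytes")
# utf-8      19 bytes  (the two accented letters take 2 bytes each)
# utf-16-le  34 bytes
# utf-32-le  68 bytes  -- four bytes per code point, as described above
```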

Each hexadecimal digit represents four binary digits, also known as a nibble, which is half a byte. For example, a single byte can have values ranging from 00000000 to 11111111 in binary form, which can be conveniently represented as 00 to FF in hexadecimal.

In the Unicode standard, a plane is a continuous group of 65,536 (2^16) code points. There are 17 planes, identified by the numbers 0 to 16, which correspond to the possible values 00–10 (hexadecimal) of the first two positions in the six-position hexadecimal format (U+hhhhhh). Plane 0 is the Basic Multilingual Plane (BMP), which contains most commonly used characters. The higher planes 1 through 16 are called "supplementary planes".[1] The very last code point in Unicode is the last code point in plane 16, U+10FFFF. As of Unicode version 13.0, seven of the planes have assigned code points (characters), and five are named.

The limit of 17 planes is due to UTF-16, which can encode 2^20 code points (16 planes) as pairs of words, plus the BMP as a single word.[2] UTF-8 was designed with a much larger limit of 2^31 (2,147,483,648) code points (32,768 planes), and can encode 2^21 (2,097,152) code points (32 planes) even under the current limit of 4 bytes.[3]

The 17 planes can accommodate 1,114,112 code points. Of these, 2,048 are surrogates (used to make the pairs in UTF-16), 66 are non-characters, and 137,468 are reserved for private use, leaving 974,530 for public assignment.

https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/

Every platonic letter in every alphabet is assigned a magic number by the Unicode consortium which is written like this: U+0639. This magic number is called a code point. The U+ means “Unicode” and the numbers are hexadecimal. U+0639 is the Arabic letter Ain. The English letter A would be U+0041. You can find them all using the charmap utility on Windows 2000/XP or visiting the Unicode web site.

There is no real limit on the number of letters that Unicode can define and in fact they have gone beyond 65,536 so not every unicode letter can really be squeezed into two bytes, but that was a myth anyway.

Well, technically, yes, I do believe it could, and, in fact, early implementors wanted to be able to store their Unicode code points in high-endian or low-endian mode, whichever their particular CPU was fastest at, and lo, it was evening and it was morning and there were already two ways to store Unicode. So the people were forced to come up with the bizarre convention of storing a FE FF at the beginning of every Unicode string; this is called a Unicode Byte Order Mark and if you are swapping your high and low bytes it will look like a FF FE and the person reading your string will know that they have to swap every other byte. Phew. Not every Unicode string in the wild has a byte order mark at the beginning.

Almost every stupid “my website looks like gibberish” or “she can’t read my emails when I use accents” problem comes down to one naive programmer who didn’t understand the simple fact that if you don’t tell me whether a particular string is encoded using UTF-8 or ASCII or ISO 8859-1 (Latin 1) or Windows 1252 (Western European), you simply cannot display it correctly or even figure out where it ends. There are over a hundred encodings and above code point 127, all bets are off.

https://en.wikipedia.org/wiki/Universal_Character_Set_characters

The UCS uses surrogates to address characters outside the initial Basic Multilingual Plane without resorting to more than 16 bit byte representations. There are 1024 "high" surrogates (D800–DBFF) and 1024 "low" surrogates (DC00–DFFF). By combining a pair of surrogates, the remaining characters in all the other planes can be addressed (1024 × 1024 = 1048576 code points in the other 16 planes). In UTF-16, they must always appear in pairs, as a high surrogate followed by a low surrogate, thus using 32 bits to denote one code point.
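
The surrogate arithmetic spelled out as a small Python sketch, following the UTF-16 encoding form rather than any particular library:

```python
def to_surrogate_pair(cp: int) -> tuple[int, int]:
    """Encode a supplementary-plane code point as (high, low) surrogates."""
    assert 0x10000 <= cp <= 0x10FFFF
    v = cp - 0x10000                     # 20 bits remain
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)

def from_surrogate_pair(high: int, low: int) -> int:
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

hi, lo = to_surrogate_pair(0x1F4A9)      # U+1F4A9
print(hex(hi), hex(lo))                   # 0xd83d 0xdca9
assert from_surrogate_pair(hi, lo) == 0x1F4A9
```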

http://unicode.org/faq/char_combmark.html

https://en.wikipedia.org/wiki/Duplicate_characters_in_Unicode

https://en.wikipedia.org/wiki/Unicode_equivalence

Unicode equivalence is the specification by the Unicode character encoding standard that some sequences of code points represent essentially the same character. This feature was introduced in the standard to allow compatibility with preexisting standard character sets, which often included similar or identical characters.

Unicode provides two such notions, canonical equivalence and compatibility. Code point sequences that are defined as canonically equivalent are assumed to have the same appearance and meaning when printed or displayed. For example, the code point U+006E (the Latin lowercase "n") followed by U+0303 (the combining tilde "◌̃") is defined by Unicode to be canonically equivalent to the single code point U+00F1 (the lowercase letter "ñ" of the Spanish alphabet). Therefore, those sequences should be displayed in the same manner, should be treated in the same way by applications such as alphabetizing names or searching, and may be substituted for each other. Similarly, each Hangul syllable block that is encoded as a single character may be equivalently encoded as a combination of a leading conjoining jamo, a vowel conjoining jamo, and, if appropriate, a trailing conjoining jamo.

The standard also defines a text normalization procedure, called Unicode normalization, that replaces equivalent sequences of characters so that any two texts that are equivalent will be reduced to the same sequence of code points, called the normalization form or normal form of the original text. For each of the two equivalence notions, Unicode defines two normal forms, one fully composed (where multiple code points are replaced by single points whenever possible), and one fully decomposed (where single points are split into multiple ones).
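
The composed and decomposed forms, and the normalization that makes them compare equal, are available from Python's standard unicodedata module:

```python
import unicodedata

composed = "\u00F1"       # 'ñ' as a single code point
decomposed = "n\u0303"    # 'n' + combining tilde

print(composed == decomposed)                                # False
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
print(unicodedata.normalize("NFD", composed) == decomposed)  # True
```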

https://hackage.haskell.org/package/text-utf8

https://www.smashingmagazine.com/2012/06/all-about-unicode-utf8-character-sets/

The High Surrogate (U+D800–U+DBFF) and Low Surrogate (U+DC00–U+DFFF) codes are reserved for encoding non-BMP characters in UTF-16 by using a pair of 16-bit codes: one High Surrogate and one Low Surrogate. A single surrogate code point will never be assigned a character.

https://en.wikipedia.org/wiki/UTF-16#U+10000_to_U+10FFFF

https://en.wikipedia.org/wiki/Category:Unicode_formatting_code_points

All characters satisfying a given condition, using properties defined in the Unicode Character Database [UCD]: https://unicode.org/reports/tr41/tr41-26.html#UCD

https://unicode.org/reports/tr29/

https://www.gutenberg.org/catalog/

https://www.win.tue.nl/~aeb/linux/uc/nfc_vs_nfd.html

Roughly speaking, NFC is the short form, fully composed, like U+1F85, and NFD is the long form, fully decomposed, in some well-defined order, like U+03B1 U+0314 U+0301 U+0345. (These are the two non-lossy normal forms.)

The rules for grapheme clusters can be easily converted into a regular expression, as in Table 1b, Combining Character Sequences and Grapheme Clusters. It must be evaluated starting at a known boundary (such as the start of the text), and it will determine the next boundary position. The resulting regular expression can also be used to generate fast, deterministic finite-state machines that will recognize all the same boundaries that the rules do.

https://hackage.haskell.org/package/text

Currently the text library uses UTF-16 as its internal representation which is neither a fixed-width nor always the most dense representation for Unicode text. We're currently investigating the feasibility of changing Text's internal representation to UTF-8 and if you need such a Text type right now you might be interested in using the spin-off packages text-utf8 and text-short.
