Mary Shaw: of SWE fame, foundational work in modularity and SW architecture
- (this talk: reprise of a HOPL 22 talk/paper)
-
myths as a narrative to make sense of the human condition
-
myths in software
-
We should look at these phenomena with a clear mind and treat them just like our code
- Frieda, iterating code-and-fix each year with a budget spreadsheet
- Scientific Programmer: incrementally building an analytic program from stepwise application and inspection
- Photoshop scripting: a "replay" is supported but is barely usable due to scope issues and environment effects
-
coding is a very small part of programming
-
scripting and multi-language interfaces prevail
-
systems are "coalitions"
-
technology ecosystems exist (-> stackoverflow analysis screenshot) and are often dominated by non-rigorous centerpieces
-
Notation for SW system architecture is not well supported in languages
-
tools for "vernacular developers" are sparse and inadequate -> Screenshot about "low-code"
-
"Mathematical tractability myth":
-
About UML (as an expression of "mathematical tractability"!)
-
about badly designed languages -> Fred Brooks: "The Design of Design" (book from a few years back)
- Brooks: "JCL [Job Control Language] was indeed a language, but not perceived as such by its designers"
-
Correctness myth:
-
"Software can be provably correct" and "specifications are sufficient, complete, static, purely functional"
-
In practice, correctness proofs are (usually) too expensive.
-
specifications are incomplete, "credentials" (what we know about the code) are incomplete and evolving. 'Non-functional properties' (should be called "extra-functional")
-
Correctness is replaced by "Fitness for a purpose"
-
Specification myth
-
AI requires an entirely new programming paradigm (?)
- in reality, AI experimentation often provided impulses to SWE, though their origins were later forgotten
-
collection and curation of large datasets
-
Things we say about AI but which apply to other software, too
(Aravind Machiry)
-
Checked-C's restricted pointer types characterising the pointed-at structure (array, singleton, null-terminated)
- no pointer arithmetic allowed for singletons, automatic null check
- array indexing not allowed unless bounds known, dynamic checks added when bound annotation present
- null-terminated arrays have automatic bounds management, arithmetic is allowed (statically checked); see the sketch below
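A minimal sketch of the three checked pointer kinds in Checked-C syntax; the functions and their names are invented for illustration and would need the Checked-C compiler to build:

    #include <stddef.h>

    /* _Ptr<T>: singleton pointer -- no pointer arithmetic, dereference is null-checked */
    int read_one(_Ptr<int> p) {
        return *p;                    /* compiler inserts a null check */
    }

    /* _Array_ptr<T>: array pointer -- indexing requires known bounds;
       the count(n) annotation lets the compiler insert dynamic bounds checks */
    int sum(_Array_ptr<int> a : count(n), size_t n) {
        int s = 0;
        for (size_t i = 0; i < n; i++)
            s += a[i];                /* checked against count(n) */
        return s;
    }

    /* _Nt_array_ptr<T>: null-terminated array -- arithmetic is allowed and
       bounds are widened automatically as the terminator test succeeds */
    size_t my_strlen(_Nt_array_ptr<const char> s) {
        size_t len = 0;
        while (s[len] != '\0')
            len++;
        return len;
    }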
-
Other features: interface types, checked/unchecked scope, safe casts, ...
-
goal here: annotate automatically to avoid spending developer effort
-
inference of checked types and bounds, mostly automatic, asking for human help in hard cases
- inference: interesting case of "wild" pointers, approach tries to localise their use
- two constraint sets to do the two analyses (wild scope, checked annotation)
-
bounds inference: (-> correlation analysis)
- associate all pointers with bounds from creation
- propagate bounds to all use sites (transitively, pointer flow graph and scalar flow graph)
- interesting case: context sensitivity (in structs)
-
Heuristics for indefinite cases
- consistent upper bound (all uses are checked against a common variable)
- name prefix
- adjacent method parameter (see the sketch below)
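A hedged sketch of the kind of annotation these heuristics might produce; the function, its body, and its parameter names are invented for illustration:

    #include <stddef.h>

    /* Before conversion (plain C), nothing relates buf to len:
     *     void fill(char *buf, size_t len, char c);
     * After inference, the adjacent parameter `len` is picked as the bound,
     * so buf becomes a checked array pointer with count(len): */
    void fill(_Array_ptr<char> buf : count(len), size_t len, char c) {
        for (size_t i = 0; i < len; i++)
            buf[i] = c;               /* accesses stay within count(len) */
    }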
-
for unsolved cases:
- find root cause for wild pointer inference and ask human
- tool provides an IDE integration to auto-apply fixes
-
Questions:
- Perf. evaluation after conversion, overhead
- no evaluation, but impact expected to be low (maybe 10% max)
- what if heuristics yield wrong checks
- a report is generated that can be inspected. a second run of the checker will detect wrong annotations.
- runtime checks are inserted. does it lead to unwanted crashes?
- yes. should probably be modified.
- what if/when you cannot convert wild pointers?
- major reason: macro expansion (considered wild).
- reporting about the "PLoP" conference (a writing workshop), a particular kind of forum held in a dedicated place (like Dagstuhl seminars). Few conferences are like that.
- Fair point, and apparently backed by research: unfamiliar environments instigate creative impulses.
-
Intro:
- dominance of OO programming vs. blog posts etc. (of varying competence) claiming that it causes us to suffer
- more fundamental critique: original ideas got diluted and neglected
- Original goal: modelling, not inheritance (code reuse)
- Beta language setting out to deliver both
- explained time and again, even at a level as simple as kindergarten knowledge
- yet many textbooks and tutorials do not make it central, maybe not even explicit.
- the focus on code reuse (i.e., inheritance) is misguided/misleading
-
Benefit of modelling is lost when reuse dominates
- typical mistakes include false inheritance (stack < vector)
- methods being "virtual" became default in a few languages,
- going fundamentally against the idea of modelling the same behaviours for objects of a family
- provides maximum flexibility for code reuse
- but does not enforce the idea of "same behaviour"
- Alan Kay: essence of OO is messages between objects (inheritance secondary)
- Programming is modelling
- missing aspect of programming curriculum (and textbooks) today
-
phenomena vs concepts
- "extends" extends the intensional aspect, explicitly not the extension
- 1987: breakpoints, source view, stepping, variable inspection
- 2022: printf debugging prevails
- What are the problems, why can't we solve them?
-
pause, inspect, continue loop
- some software just cannot be stopped, or does unrealistic things when stopped
- requires iterative re-runs to trace the root cause of wrong data
- non-deterministic bugs are hard to observe in this way
- record execution to enable revisiting identical state sequence
- drawbacks: overhead, impractical to apply to production-sized data, cannot be always-on
-
single program state display
- vs. the need to understand change over time of values in the state
- "singlestepping is always a sign of inadequate debugger functionality"
- solution: query-based and omniscient debugging
- collect detailed information about (all or) interesting data over time, and a way to present and query it (DB backed IDE)
- drawbacks: upfront cost, unfamiliar interface
-
Siloed debuggers
- usually specific to a language, new interface must be learned
- alleviated by IDEs presenting a unified debugger interface via plugins
- multi-language projects are extremely hard to work with
-
debuggers are often an afterthought
- tools lag behind the evolution of the language and ecosystem
- examples: cloud-deployed apps, optimised code (and relying on optimisations), novel language features, special build requirements
-
Cultural problems (reputation of debuggers)
- debuggers often don't work, people give up on them
- can we just use logging?
- seems to work, and avoids some technical barriers
- requires iterations adding log statements, potential non-determinism, problem of data volume, change of program behaviour
- cannot track dataflow (only captures single states)
- debugging is hard, so ...
- writing small tests (requires mocking etc.)
- disabling flaky tests instead of fixing => loss of coverage
- better debuggers would address this
- better understanding of the debugging process
- time needed to understand the root cause, and what tools help with and how
- pervasive always-on recording (need to solve technical issues)
- does not solve inherent non-determinism (data races)
- better or broader?
- "the PII problem": cannot debug real production data
(Ilya Sergey)
-
using separation logic to express disjoint segments of heap
-
example: copying a singly-linked list
- using Suslik (generates a program from a separation logic spec); the rough shape of such a spec is sketched below
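The list-copy spec has roughly the following shape in the Suslik papers (not necessarily the exact syntax shown in the talk; lseg is the linked-list-segment predicate, S the set of stored values, and r a pointer through which the list head is passed in and the copy is returned):

    \{\, r \mapsto x \ast \mathrm{lseg}(x, S) \,\}\ \ \mathtt{listcopy}(r)\ \ \{\, r \mapsto y \ast \mathrm{lseg}(x, S) \ast \mathrm{lseg}(y, S) \,\}

From this pre/post pair alone, Suslik synthesises the recursive copying code.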
-
this paper: providing a proof certificate derived from insights during synthesis
-
how Suslik works
- -> "Synthetic separation logic": gradual reduction of a goal, emitting program parts as a by-product.
- problem: cannot rely on program structure to leave the post-condition unchanged (the program is being composed on the side)
- if post-condition is transformed, "delay their insight"
- using one of several possible separation logic verifier libraries in Coq.
-
Contributions: implemented an abstract proof evaluation framework, 3 instances, evaluation on characteristic benchmarks
-
example walk-through -> screenshots
- different kinds of steps
- delayed post-condition modifications
- carrying a proof context to store unifications with variables
(Peter Thiemann)
-
prototype language, resource mgmt. using kinds and constrained types
-
concept: distinguish kinds of use (linear, affine, unrestricted) for variables.
- automatically inferred during type inference, no annotations required
-
paper goal: automate parts of Rust's analysis (which is fully-manual by default)
-
(small) kind system with subkinding and polymorphism
-
functions and capture
- captured linear variables -> function usage mode restricted to linear
-
"borrowing" concept, distinguishing reads and writes
- borrows have a kind of their own. Shared borrows = unrestricted, exclusive borrows = affine
- important distinction between consuming use and (exclusively) borrowed use
- restrictions on simultaneous borrows (and direct use) and on the scoping of borrows (which involves nesting and is complicated)
- uses sophisticated type simplifications to get rid of all subkinding constraints accumulated through region nesting and function usage restrictions
-
Questions:
- Elaborate on similarities with and differences to Rust
- all essential concepts from Rust are present
- however, this work comes from a different background (FP)
(Benoit Montagu)
-
"stable relations" as opposed to deriving outputs classically from given input
- unnecessary inputs won't change anything (stability); see the sketch below
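One plausible way to state that property (my reading, not necessarily the paper's exact formulation): a stable relation R that relates an environment \rho to a result v also relates every extended environment \rho' \supseteq \rho to the same v, so adding unused inputs cannot change the outcome:

    (\rho, v) \in R \;\wedge\; \rho \subseteq \rho' \;\Longrightarrow\; (\rho', v) \in R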
-
environment of a denotational semantic interpretation stores stable relations
- compositional interpretation, result is stable
-
straightforward extensions for pairs and projections
-
interpretation of lambda terms
- universally quantified over both all possible outcome pairs and all input subsumptions (to achieve stability)
-
implemented as a POC for stable relations