These are the notes from the talks presented at the 2nd MEPLS meeting.

Graph Drawing in a Functional Domain Specific Embedded Language

Speaker: Chris Gaconnet, University of North Texas

This talk presents the speaker's ongoing thesis work, including goals, approach, challenges, and prospectives.

Goals.

  • Apply FRP to real problem.
  • Functional images DSEL for programmatic, scalable shapes
  • Compare DSEL vs DSL
    • e.g. DOT is stand-alone, what if we embed it into Haskell for greater contorl?

FRP is good for declaratively modeling time-varying values.

type Behavior alpha = Time -> alpha
type Event alpha = [(Time, alpha)]  -- list of time-stamped values

switcher :: B a -> E (E a) -> B a
stepper :: a -> E a -> B a

moveXY :: Real -> Real -> a -> a
-- "moveXY x y foo" moves object foo to coordinate (x, y)

time :: B Real
time = id -- retrieve the time

-- Create two moving marble objects
lrBlueMarb = moveXY wiggle 0 blueMarb
udRedMarb = moveXY 0 waggle redMarb

-- Put them on the same screen
marbDance = lrBlueMarb `over` udRedMarb

Abstract out velocity, position, energy (force), etc with FRP

Switching gears, let's consider making a DS(E)L for creating graphs.

It's common for someone to write a code generator for graphviz, because graphviz itself is not capable of generating graphs.

Idea (functional images): model the graph as functions.

type Point = (Float, Float)
type Image a = Point -> a

type RGBA = (Float, Float, Float, Float)
type ImageC = Image RGBA   -- (= Point -> RGBA)

This is like a bitmap, but by representing it as a function from float-coordinates, it becomes scalable like vector graphics---you just invoke the function at higher floating-point precision to render a larger image or with greater precision.

Problem: how do we detect overlaps and stuff? With vector images (like Bezier curves), there are established techniques like:

  • bounding box
  • convex hull
  • approximating polytopes
Can we find analogous techniques for functional images?

Questions

Q: Hard to determine boundary; how about using sampling?

A: I'm thinking of that.

Q: Graph is a completely different approach than images. You want to find a way to integrate them.

Q: There's been a similar approach tried in Java. Functional programming has stayed away from graphics.

Q: No reason we can't have an OO representation of images.

Q: FP has huge efficiency penalty. FP doesn't have computable objects preclude a lot of optimizations.

Q: If you want to compose two images, you write one into the other. In Haskell, you'll build a new copy, but it's like you're sharing (?). Postscript is a functional representation of images and it's immensely useful.

Q: In the early days of PS, there were noticeable overheads.

A: Efficiency is going to be a problem here.

Q: Electro-magnetic repulsion model?

Verilog Preprocessor (VPP)

Speaker: Cherif Salama, Rice University

Verilog is a hardware description language.

Hardware is getting bigger and more complex. Dimensions of circuits that we want to analyze:

  • gate counts
  • wire delays
  • power consumption
  • etc.

Vision: say no to automation. Automation takes away a lot of opportunities for control from you.

Two different Verilog subsets: structural vs behavioral. Structural describes the bare wires; behavioral states what the functionality of the circuit shall be, without saying much about how it's realized. Using behavioral means you're losing control.

EEs think this approach is mad.

Q: Why do EEs think you're mad?

A: People are going in the direction of using higher and higher-level descriptions.

Q: You're saying hardware is getting more complex. Why does making them think more of the details help?

A: Designers know a lot about their circuits. Letting tools make decisions for you can result in circuits that don't meet your requirements.

Enhanced generative constructs.

Example: n-bit ripple adder. You don't want to repeat the instantiation of full adders n times when you need to stack n full adders. A for-loop gives you the kind of code generation that you want.

Problem with generative constructs is it can make the description unrealizable as hardware (unsynthesizable), unlike purely structural descriptions.

Q: this generative construct already exists in Verilog and you're adding checking?

A: Yes.

General synthesizability: after elaboration (executing the generative constructs), do we have a synthesizable circuit?

Generative is like multi-stage programming.

Featherweight Verilog (FV) is a calculus that models Verilog. We have formalized elaboration semantics with this calculus. There's a type system that ensures synthesizability.

We want to make sure that a module instantiation doesn't depend on the value passing through a wire at "run"time (since that's unsynthesizable).

Static consistency checking. In current Verilog, inconsistencies (assigning an array of wires of size n to an array of size k, with n not equal to k) are padded.

Q: Why was such a semantics chosen?

A: Original language designers decided this was useful/convenient. It turned out to be a major source of programmer error.

Array-out-of-bounds is detected right now at elaboration time, after you've invested quite a bit of CPU time into it. We want to detect problems statically.

Approach: Dependent types. Mark each array with its size, represented symbolically. Array references generate constraints (inequalities) that must be satisfied by the size of each array. You check with an SMT solver (Yices) that those constraints cannot be violated.

As a freebie, we get unreachable code detection. If a statement generates unsatisfiable constraints (as opposed to unviolatable ones) we know that that part of the program is unreachable.

Static gate-count estimation. We generate a parametric, symbolic expression of the estimated number of gates used by a module. We tried this on a bunch of simple circuits, and it turned up an exact assessment of the gate counts.

Future work:

  • More examples
  • Static delay estimation
    • tradeoffs with gate count
  • more expressive abstractions
    • what's possible?
    • recursive, higher-order modules?

Questions

Q: Throwing FP concepts at hardware sounds really cool. What's the reaction from Intel people?

A: Our collaborators from Intel are actually PL people so they're familiar with FP.

Q: Tradeoff: In the past, they just did the synthesis. What's the benefit of this approach.

A: Elaboration takes a long time.

Q: Couldn't you just stop elaboration before you do optimizations etc and do synthesizability checks quickly?

A: perhaps.

Q: you can give an estimate that it takes so many of my adders. Delays may depend on the kind of chip you're working with. Not about ASIC, but like FPGA.

Usually with FPGAs, you give a higher-level description and the proprietary synthesizer decides what to do. There are ways to force how an FPGA does things, basically by recreating another FPGA on top of it.

The framework is there in Cherif's work. You estimate something that is related to actual resources.

Q: Have you looked at Lava or other FP-based stuff?

A: I've looked at some of them, including Lava, but Verilog is already an established tool.

Lava is part of inspiration. Lava doesn't have all these static checks, though. That's the strength of this work. We need to hide things under the carpet to make this a Verilog++.

With static checks, it's easier to pinpoint the source of error too, compared to post-elaboration analyses.

Q: do you think adding lava-like features would complicate the analyses?

A: higher-order modules at least sounds easy.

Walid: I expect some things will break.

William: you guys like the structural stuff because it's two-level.

Q: I have the impression that Verilog is a huge pile without sensible semantics. You're trying to build a verilog that has some real properties.

A: Yes.

William: Verilog is really just a graph construction language. That's what structural is.

Walid: so the best thing about Verilog is that the subset needed to describe the graph (structural) is tiny. Might be a connection with the first two talks.

Q: VHDL solves some of the problems you tackle.

Walid: Not always. It doesn't do everything you want to do either.

New Foundations for OO Languages

Speaker: Robert "Corky" Cartwright, Rice University

William: Operational language is simple and good, but it's like a hack. It's like assembly language.

Motivation. We need semantics models for languages. After 50 years of PL research, we have very sophisticated models but it's kind of unbalanced. They're mostly for static (mostly) functional languages, and the models for languages like Java or C# is very weak.

Denotational semantics has been out of favor for the last few decades.

Anonymous delegates in C# is Corky's contribution and it's essential for elegantly coding up prime-sieve using lazy streams.

Java generics is a mess, mainly because of the erasure semantics which makes generics not first-class. Wildcard types are also hard to wield. With a close inspection of the type inference system, it's just broken. On the other hand, these are very useful.

What should first-class generics look like?

Semantics of Java is built on Cardelli's denotational models but nominally typed languages don't play nicely with them.

Secondary motivation: domain theory is fun. Corky wants to see it revived.

Denotational semantics is foreign to most students these days, but a lot of researchers work off of intuitions obtained from models.

Operational semantics doesn't provide insight on pieces of programs; only for complete programs.

Q: Mathias' work does talk about equality of expressions out of context.

A: Connections between operational and denotational is important. You can have notions of equivalence on pieces of programs with operational, yes, but denotational is still useful.

Denotational semantics and model theory are closely connected. Model theory gives a good picture of the mathematical properties.

Denotational semantics starts with the lambda-calculus (no surprises there). Lambda-calculus was really about defining things more than computing things.

Denotational semantics for lambda-calculus give a mapping from lambda terms to a mathematical structure, the Scott domains. To construct it, we need to solve the domain equation D = D -> D, which is impossible on the face of it due to cardinality considerations.

Observation: not all functions from D to D are topologically well-formed. Computation can't take back the part of the result that it has already produced, but some set-theoretical functions do. By removing some of those set-theoretic functions, we might be able to reduce D->D to have the same cardinality as D. Monotonicity and continuity capture this idea.

  • monotonicity: If d1 <= d2 (information content of d2 is a superset of that of d1) then f(d1) <= f(d2)
  • continuity

Taking the set of continuous functions doesn't increase cardinality, unlike the set of all set-theoretic functions.

Scott showed general techniques for solving domain equations. Cardelli took them and developed a theory of inheritance.

Types in domain theory: domain equations contain domain constructions that require a notion of types. Unlike in set theory where the universal set is ill-defined, a universal domain exists that can embed any domain as a subdomain, relying on coercions and retractions. Projections can substitute for them if the universal domain is properly formulated.

Coercion can't properly model polymorphic types (subtyping). Don't know if it can properly model explicit polymorphism (like System F).

MacQueen and Sethi proposed that types should be modeled as weak ideals of a domain.

Walid: from the mathematical point of view, this seems to be what you get. But this explanation is inaccessible to programmers and computer scientists.

William: You can do all this domain theory, but it doesn't really do anything for us.

Ed: There are certain things that are easier to do with operational. Denotational can give you a warm and fuzzy feeling about having intuitive understanding but it doesn't give concrete benefits. If you have beta and delta rules for ML, that's a lot easier to understand than denotational.

Corky: Think: why can't you define parallel or in ML? Denotational gives you a good explanation of this but operational does not.

I (Corky) showed that using intervals instead of ideals gives a better theory.

Cardelli's inhertance construction. Take the domain equation

D = Const + [D x D] + [D + D] + [D -> D] + Error

Cardelli adds Rec[D] (finite records over D) and Variant[D] (variants over D).

William: Does this distinguish subtyping and inheritance?

Corky: Cardelli's approach gives you a framework to talk about inheritance.

Why Cardelli's model is a poor match to OO languages.

  • Class name is stripped; i.e. record types do not have names
  • Variants must have explicit tags but in inheritance you have true unions.
  • Mutation breaks model

Type soundness of nominally typed OO languages are proved operationally because denotational is underdeveloped. Is that approach enough? They give little insight about the type system.

William: Pierce's book talks about "encoding" of FJ which is like denotational semantics.

Corky: I find the codings complex. Doing the denotation of that is an unnecessary level of indirection.

Approach. Construct a domain so that the meaning of every possible programs (and fragments) are built in.

General idea is to have tags that identify the type of the object and all supertypes.

Deploying Functional Code

Speaker: Dwayne Towell

Example:

Section
  time, room, class
Class
  name, hours, list<sections>
Student
  name, list<sections>

Would like to just write

enroll (student, class)
and have that work.

In functional C++ (functional, first-order, type-safe subset of C++):

Student enroll (const Student &s, const Class &c)
{
  const unavail = union (s.sections.time);
  const sect = choose (c.section, unavail);
  return Student (s.name, s.sections.push (sect));
}

Issues:

student.enroll (class);
Doesn't work because we lack mutation.

Object construction, assignment, and destruction kills performance. Modern compilers can handle a lot of cases these days, though.

Introduce a wrapper class that does the mutation.

class Student {
  FunctionalStudent s;
public:
  void entroll (Class c) {s = s.enroll (c);}
}

The FunctionalStudent is relatively easily verified with ACL2. Then the mutation wrapper can be generated automatically, with overhead optimized away.

Integrate syntax for adding assertions to be proved.

Staged Monad Transformers for Fast Modular Interpreters

Speaker: Ed Westbrook, Rice University

Monads increase modularity in interpreters

  • separate features from interpreter proper

Monad transformers separate individual features.

But they're slow. Goal: remove overhead with staging.

Imagine writing an interpreter without monads.

let rec interp t =  (* t is the term *)
  match t with
  | Literal i -> i
  | Add t u -> (interp t) + (interp u)

Say we want to add functions and variables.

let rec interp t env =  (* t is the term *)
  match t with
  | Literal i -> i
  | Add t u -> (interp t env) + (interp u env)
  | Var x -> lookup env x

You need to pass around env everywhere. If you add a reference, you need to pass around heap as well.

This is not modular. For every feature you add, you need to change the whole interpreter; but for most of the match cases, there's a "default" thing you do. We want to abstract away those "default" things. Monads do just that.

A monad is a type "recipe"

  • int m is a type of recipes for ints

Two monad operations

  • ret x
    • "just return x and do the default thing"
    • has type 'a -> 'a m
  • bind r f
    • combining two recipes
    • "perform recipe r and pass result to f"
    • has type 'a m -> ('a -> 'b m) -> 'b m

Monadic interp returns a recipe

  • interp : term -> value m

Still have to run the recipe to get the value.

Monadic interpreter:

let rec interp t env =  (* t is the term *)
  match t with
  | Literal i -> ret i  (* just return i *)
  | Add t u ->
      bind (interp t) (fun x ->
      bind (interp u) (fun y ->
        ret (x + y)))
  | ...

Each monad has its features, which is a special monad.

  • Operation update : * (state -> state) -> state m

let rec interp t env =  (* t is the term *)
  match t with
   ...
  | Ref t -> bind (interp t) (fun x ->
             bind (update (fun h -> ...)))

Monad transformers takes a monad and yields a monad with added features.

Monads add overhead.

  • State monad adds extra function call at each use of `bind'.
  • For transformed monads, bind adds O(#features) function calls.

Multi-stage programming: use abstractions without overhead

  • Manipulate code as values
  • stage 1: create (hopefully more efficient) code
  • stage 2: compile and run it

Staged monad is a recipe where

  • ingredients are code
  • running a staged monad returns code

A staged monad is:

  • the original monad ('a m)
  • a second monad type ('a sm)
    • with operations sret and sbind
  • Staging operations relating them
    • flatten: 'a code sm -> 'a m code
    • unflatten : 'a m code -> 'a code sm (inverse of flatten)
  • flatten is how to run a staged monad

In the interpreter, you just use sret in place of ret and sbind in place of bind. Then you add brackets as necessary.

Staged if expressions with monads is hard.

  • Solve with another operation: local_flatten (see paper)

Results: a modest-size, modular interpreter that has a bunch of features.

Q: What about comparing these to hand-written direct style interpreters?

Ed: not really our focus here. We wanted to see how much overhead could be attributed to the stageable part.

Walid: It would be interesting to see how much performance benefits you get by hand-inlining the monadic operations.

William: I thought you were just removing the overhead due only to the modularity.

Nested Control Regions for Containing Information Flow

Speaker : Andreas Gampe, University of Texas at San Antonio

Focus on compilers.

confidentiality

  • discretionary access control
    • check before operation, e.g. unix ACL
    • what happens after this check? You might just ignore the results of th echeck
  • mandatory access control
    • tag every data element with a label
    • -> space & time overhead
  • static information flow analysis
    • mandatory accss control at compile time

Flows of information that need to be controlled

  • explicit flows like y = x + 2
  • implicit flows like if x == 0 then y = 1 else y = 0 (flow from x to y)
  • timing and termination flows
    • if x == 0 then sleep (100) else exit (can time the program and infer x)

State of the art in CFA is security-type system

  • annotate types with security labels
    • int{secret} x;
  • extend typing rules to check those tags

Goals

  • have a structured intermediate representation
  • linear time, singleass loading & verification

Nested Control Regions

  • Code region (SSA instruction in basic block)
x = param(1)
y = param(2)
z = x + y
  • Conditional regions
  • Exitable regions
  • Restricted goto regions (only forward branching)
  • Loop regions
  • Try-catch regions

Example

while (v < 10)
 v = v + 1

--- gives regin nesting

exitable
{
  loop L (v_phi = phi (v0, v1))
  {
    conditional (v_phi < 10) then
    {
      ...
    }
    else
    {
      ...
    }
  }
}

Dominator approximation (may be imprecise)

  • statement (or program element) A dominates B if A is always executed prior to B if B is executed at all

Extending type system

  • Based on Decentralized Lable Model
  • Labels are organized into a lattice
    • bottom/top labels
    • Policy labels (p = o:r1,...)

Annotating regions

  • Security is basically a data flow problem. Superlinear in the presence of loops
  • For a single-pass solution, use fixpoint soln.

pc labels manage implicit flows

  • pc labels approximate the information learnt at current part of the program
  • prevents the program from writing to "low-grade" variables when use of high-grade data is recorded in pc label

SSA helps with label inference

Open problems

  • class hierarchy + exceptions complicates things
  • current soln: don't throw subclasses
    • hope to find better semantics
  • declassification

Q: is the idea to verify info flow safety like java VM verifies bytecode? A: yes.

Q: is this really the kind of thing the army (for example) wants? A: ...

Q: is the type qualifier work related to yours? A: I haven't read that.

Gel: A generic extensible language

Speaker: Jose Falcon, University of Texas at Austin

Language development was traditionally:

  • stream of text --> specialized parser --> AST

Java and CSS have different syntax rules and you created two separate parsers. You can't interchange the parsers. You can't embed one language into the other, even though their lexical and syntactic conventions are relatively similar.

William: there's been a lot of research on extensible parsers. This is about a new approach.

Generic approach

  • use generic language to encode programs in different languages
  • convert java file -> generic java file; and convert css file -> generic css file -> generic parser

Problem: very verbose

Gel produces human readable generic syntax. It captures a syntactic standards of some languages.

  • Primary, unary, binary operators, grouping constructs, and quotes

The set of operators is not fixed

Gel expression is any combination of operators

Precedence & associativity

  • Require strict precedence
  • Precedence determined by the first character
    • The precedence of each character is prescribed
  • Associativity omitted (i.e. is taken to be associative, not left nor right assoc) when possible, right assoc otherwise

Sequences

  • expressions not separated by an operator
  • f a 3 + g 10 ==> (+ (_ f a 3) (_ g 10))
    • _ denotes sequence, not (necessarily) application
  • must have higher precedence than operators

Spaces

  • Parsers generally ignore whitespace, but Gel doesn't
  • "a + * b" is not interpreted the same as "a +* b"
  • "a + *b" ==> (+ a *[b]) where *[...] means unary prefix
  • "a+ * b" ==> (* [a]+ b) where [...]+ means unary postfix
  • "a + * b" ==> (* (_ a +) b) here + is interpreted as a symbol not an operator

Chunks

  • many situations where sequence has lower precedence than operators
  • operators without whitespace have higher precedence than those with whitespace

Keywords

  • most difficult part of Gel
  • "return x + y" gets parsed by default as "(return x) + y"
  • "return: x + y" gets parsed as expected

With Gel, the parser development becomes like this:

  • stream of text -> GEL parser -> GAST (Gel AST) -> GAST transformer -> AST
  • GAST transformer is written by language developer

Q: do you have any special rules for commas and semicolons? A: they're punctuation.

Q: How easily would you turn this into an ISO standard? This will fit into a generic-precedence operator. How was this implemented? A: We tried a bunch of stuff. Current implementation is in rats.

Q: what does Gel buy you compared to things like Antler? A: This only makes sense when you want to embed things.

Q: how do you evaluate this? This seems to work well but it's ad-hoc. A: It works pretty well for most languages but not for anything.

Slice, Partition, and Reforest for Data Access and Distribution

Speaker: Ali Ibrahim, Texas University at Austin

Service Oriented Architectures

  • Heterogeneous (different languages, different OSes, different CPUs, etc)
  • boundaries defined by interfaces
  • Latency (on the order of milliseconds) is an issue

Example service

  • Remove service that plays music on your speakers.
  • OO design
  • Invoked through RPC

for (Album album : musicService.getAlbums ())
{
   ... // use album over and over
}

The album here is called a proxy . This model hides the network latency and you can incur high performance penalties.

Q: can you instead build a query and send it...? A: that's an important observation

RPC paradigm

  • client stubs forward calls to remote service.

To minimize round-trips:

  • Remote facade: hard-coded composite operations.
  • Data Transfer Objects (DTO): bulk transfer of data.

Problems with these

  • Server design depends on client usage.
  • Duplicates code, hard to evolve.
  • DTO may not preseve identity of server objects.
    • You get an album and you send it to a server. Is that a new album or one that's already in the server? --> work needed to answer this question
    • tradeoff between size of interface and performance

In many ways, DTO is one of those objects our mothers told us never to write.

Remote Batch Invocation

  • have a "batch block" that can contain remote code that gets gathered to the beginning and sent off to the server; the rest is then executed.

RBI Details

  • No proxies. Only primitives can be transferred.
  • One round-trip per lexical batch block.
  • Programmer must consider re-ordering as part of batch semantics.
  • Simple, efficient, but unsound implementation.
  • Compiler analysis is simple local data-flow.
  • Compiler performs best-effort analysis to warn of possible bugs due to reordering.

Batch partitioning is like reforestation

  • To do sum of squares of a list, you can collapse the composition of a sum function and squares function into one function that takes one pass through the list.
  • Batch partitioning reforests the code and fishes out the part that can be shipped off to remote.
  • See paper for more details.

Batching database access

  • Build SQL queries etc. before sending off

Q: Link stuff at Microsoft looks similar. A: This is like a generalization/reification of Link.

RMI creates problems with GC. With proxies removed, GC can kick in in a more timely manner.

Declarative User Interfaces with Property Models

Speaker: Jaakko Jaervi, UT Austin

Motivation: the cancel button grayed out in print dialog (a modal one, too!)

Why is software like this? The dialog is implemented by a bunch of handlers that all modify a shared state. It's too complex for one person to wrap his/her head around.

GUI components are well-tested and reusable, but their compositions are not.

Walid: An orthogonal remark; this is an instance where operational semantically-driven design results in bad shape. With denotationally-driven design, where everything is required to be compositional, this problem would not arise.

We need an understandable model. Incidental structures and algorithms that current GUI toolkits provide do not have such a model.

datapoint: Adobe's desktop applications: 1/3 of code is event handling.

Approach

  • understand the role of a UI
  • define a model that captures commonalities in interfaces

Command only interested in a few values.

Core of the model is a multi-way dataflow constraint system

  • variables tied together by constraints
    • height_abs = height_init * height_relative / 100
  • solution defines a dataflow

Command Parameter Synthesis

  • Each UI element has a variable in a constraint system
  • Event handling code becomes auto-generated boilerplate
    • At all times, UI element shows value under current solution.

Currently programmers must explicitly express when to disable widget

  • The new approach allows to automatically enable/disable according to property models

Current Status and Future Directions

  • Deployment at Adobe
    • Drastic code reductions
    • Drastic decrease in defects
    • Consistent UI

Q: is it possible to combine relationships into one? A: since we treat things as black boxes, they're not invertible. This is necessary to model multi-directional constraints.

Tactic-Driven Synthesis of Global Search Algorithms Based On Constraint Satisfaction

Speaker: Srinivas Nedunuri, University of Texas at Austin

Some programs (planning, scheduling, configuration) are difficult to get correct, efficient, and optimal. "Optimal" here means whether the algorithm finds an optimal solution (or something sufficiently close to it).

Constraint Satisfaction

  • e.g. knapsack, graph problems, integer programming

Maximum Independent Segment Sum (MISS)

  • Given an array of elements, find a subsequence with greatest sum such that no adjacent elements are chosen.

Solving with search.

  • recursive search that incrementally extends partial solutions into full solution
  • a recursive search leads to a search tree

Global search with optimization

  • program schema + operator
    • operators that define the search space
    • operators that control the search
  • e.g. of operators
    • extract, root (space-forming operator)
    • necessary filter (necessary condition for a space to contain feasible solutions)
    • psi (search space -> refined search space)
    • lb (lower bound on the cost of the best solution)

Program Schema for GSO class

  • generic search algorithm
    • specific searches are instantiations of this generic algorithm obtained by replacing the operators with problem-specific ones
    • want to generate good operators automatically

It's not really divide-and-conquer.

Q: You're trying to solve arbitrary constraint problems, or a specialized subclass? An arbitrary problem can be NP-complete and you might as well throw it at a SAT solver. A: The idea is given a classic algorithm problem, like max indep sum or sort, you can phrase that as constraint-satisfaction. Given that spec, you can generate an algorithm specifically for that problem by exploiting characteristics about the constraints. You don't have to verify your resulting algorithm because it's correct by construction.

Specware creates proof obligations for correctness if you need them.

For MISS, * z0 (root) = \lambda x. { m = empty, cs = x.vals } * extract checks that every element has either been chosen or rejected, etc.

This is a correct instantiation but still exponential. -> use dominance relations

  • Compare one partial solution with another to determine if one of them can be discarded
  • semi-congruence

  • weak dominance: a sufficient condition for any feasible completion of z to be better than a corresponding completion of z'

Approach

  • first calculate the semi-congruence relation
  • reconstruct something (didn't understand what)
  • reduces complexity for MISS in particular

Greedy algorithms

  • always makes locally optimal choice
    • weighted matroids, dynamic programming specializing into greedy algorithms, property-based classification, ...

Weighted matroid

  • A triple (S, I, w) with a finite set S, I a set of independent subsets of S, and w a weighting function, with some axioms
  • Greedy algorithm finds the subset I with greatest w value

Proposal of a dominance-based characterization of greedy algorithms

Q: is this going to capture all greedy algorithms. A: probably not, but may capture an interesting class of them.

Greedoids: matroids that weakens the hereditary axiom.

Whitney noticed that the basis of a matrix forms a maximal independent set, like a spanning tree of a graph. Matroid is a generalization of them.

Curtis has a characterization of greedy algorithms, but it's inadequate.

Q: has someone proved the absence of a greedy algorithm for some problems? A: Yes, but if it's NP, people don't bother looking for greedy algorithms.

Calculation of the operators can be non-obvious.

William: We have the generic search algorithm. The goal is to calculate good operators. Dijkstra's weakest precondition stuff from postconditions that we want. This is a human activity and not a mechanical algorithm.

Q: I can't parse the calculation slide. A: It's in the form:

expr1
op  {justification of op} 
expr2
op  {justification of op}
expr3
...
where op can be , implied-by (<), etc.

Q: This is like dynamic algorithms. A: Yes. You're trying to find a good principle that guides the search.

Binary integer programming

  • You'll solve this by incrementally extending the assignment

Tactics

  • provide hints to the developer for calculating the operators, based on the shape of the formula that's used to derive the operators
  • there's a theorem that guarantees that we get a correct phi operator

Hope to form a library of useful tactics.

Comment: The problems that John was talking about yesterday can be seen as binary integer linear programming, so it might be interesting to look at it.

William: if you're interested, there are many papers by Doug Smith on this. This is a highly specialized topic so it may have been better to walk through some basic algorithm stuff...

On isomorphic data type transformations in functional programming

Speaker: Paul Tarau, University of North Texas

Transformations are everywhere

  • compilers
  • complexity class definitions

Shapeshifting between datatypes

  • bijective mappings between datatypes using a strongly typed language (Haskell) as watchdog

Challenge

  • encode everything as everything else --- how?
  • everything as literate Haskell program
    • borrow from combinatorics, category theory, number theory, typed lambda calculus
  • bijective Goedel numberings

Groupoid of isomorphisms

data Iso a b = Iso (a -> b) (b -> a)
from (Iso f _) = f
to (Iso _ g) = g

compose :: Iso a b -> Iso b c -> Iso a c
itself :: Iso a a
invert :: Iso a b -> Iso b a

to :: Iso a b -> (a -> b)
from :: Iso a b -> (b -> a)

borrow :: Iso t s -> (t -> t) -> s -> s
lend :: Iso s t -> (t -> t)

Q: you already made a choice to hard-code borrow2 and borrowN, although you want to push this over products and lists. A: you don't need anything more generic. Q: to be more consistent with categorical... A: I wanted to keep it simple but yes, you can turn it into natural transformation and everything.

Encoder is a connection to Root. Root is a sequence of natural numbers, like a Goedel encoding. Sequence of number is more convenient than one giant number.

Walid: I object to Nat = Integer. By giving an inductive definition, you can generate tests, and it would keep you from doing certain undesirable things.

as :: Encoder a -> Encoder b -> b -> a

Q: This gets you a conversion from a list to some tree but you don't know which one. Don't you care which one?

Example:

as set fun [0,1,0,0,4] ==> [0,2,3,4,9]
as fun set [0,2,3,4,9] ==> [0,1,0,0,4]
The former works by mapping (+1) and then taking partial sums to make it increasing, which represents sets.

The ranking problem for a family of combinatorial objects (e.g. trees, lists) is finding a unique natural number (the rank) associated to it. The inverse is called unranking.

"Combinatorial objects" here includes some second-order things.

Q: what exactly is the rank? The root?

A: ranking is an isomorphism with nat.

Examples:

  • Hereditarily finite sets: use Ackermann's encoding.
  • (Set-theoretic) pairs
  • Directed graphs
  • Hypergraphs
    • do hereditary set construction twice

Simple FPL

  • Proposition: 2^x(2y+1) = z has a unique natural solution (x,y) for any positive integer z.
  • Define a cons operation as "x y -> 2^x * (2*y +1)" and hd, tl as expected..
  • Then the lhs of the proposition is the codomain of cons.

Other uses

  • random test case generation
    • generate a random number, map to structure

http://logic.csci.unt.edu/tarau/research/2008/fISO.zip

Encodings involvng transitivity is difficult.

  • finite posets, finite topologies
  • finite categories

Comment: Exhaustive testing for small values often reveals a lot of important bugs. So if you test them for all lambda terms of depth <= 5, that's useful. But it's hard to ensure that you didn't skip or repeat values. The isomorphism gives that. Same ideas and benefits apply to type system verification.

Q: Can this be used for data compression? A: Certainly for simple types. e.g. tree. But in general that's a separate research topic.

Q: Is it useful to restrict your encoding? e.g. if you have an ordering on the original type. Can I ensure preservation of ordering? A: Sometimes yes, sometimes not (?).

Ideally you'd be able to express that in the type system, but in general it may be a separate proof obligation.

Q: could you embed labels in a graph? A: yes. They're pretty easy. It goes with herediarily finite sets.

Acumen: An Environment for Rapid Prototyping of Cyber-Physical Systems

Speaker: Angela Zhu, Rice University

Applying PL techniques to designing physical device

Cyberphysical systems:

  • Sometimes called embedded systems (computer + physical device)

Design flow

  • design -> simulation -> prototype -> production
  • each step may require restarting the process from design
  • design <-> simulation cycle is the "hot spot"
    • currently also expensive
    • engineer needs to know both physics and programming well

Why simulation is important: bus suspension exmaple

  • We have some DEs describing the dynamic behavior of the system
  • Goal: design an electronic stabilizer for the suspension
  • An algorithm might work well for all frequencies if running on a 100-Hz controller, but may work poorly on a 10-Hz controller.
    • simulation lets you catch this problem early on in the design process at low (monetary) cost

The "Gap": mech engineers use mathematical descriptions, but they're not executable

Acumen = PhyDL + RIDL

  • describe physics in PhyDL
    • notation just like math!
  • device driver written in RIDL

Q: what does "solving DE" entail? It could be numerical with floating-point approximations or with exact real arithmetic, or it could mean symbolic solution.

More on PhyDL Design

  • support both derivatives and integrals
  • support higher-order derivatives
  • "support" meaning you can write them directly in your program
  • implicit-time notation
    • F(t) is written as just F, where t is time.
    • aka point-free style
  • module system

RIDL is basically FRP.

Q: can you take derivatives of arbitrary programs? There are three differentiations: symbolic, numeric, and algorithm differentiation. Are you just doing symbolic derivatives?

A: No, only of variables (idealized, continuous functions). We don't do symbolic differentiation of programs. You could have some discontinuity in those things, but that's orthogonal. The PhyDL program is a model of the physical system. Allowing code differentiation could be a useful thing to do though.

Comment: derivative of expressions can be handled by introducing a temporary variable, either by the implementation or by hand. Not a big deal either way.

Q: do you assume that things on the right are used to compute the left hand side?

A: You mean causal vs acausal. We don't make that assumption (acausal forms are supported).

"Acausal" is also known as an implicit function definition.

Interactive simulation in Acumen

  • Two balls connected by spring-damper
  • Black ball chasing white ball controlled by user
    • Try to drive black ball into a certain pair of boxes alternatingly
  • real-time simulation

Acumen is easy to learn

  • Got group of people from different backgrounds (including non-CS/EE) to use it without much trouble

Numerical accuracy

  • Small simulation errors are OK in practice
  • At least as good as existing system (Simulink) for sufficiently slowly changing systems
    • can withstand scattered instances of quick, large changes

Q: are you doing your own simulation? A: yes.

Q: how do you determine the step size? A: we ask the user.

Q: Is there a way to infer the step size from other information that the user provides? Like the frequency of the controller (for linear systems)? A: That's a good point. That scheme can work. Another possibility is to ask the user for how much error is permissible and adjust step size from it.

Q: Can the step size be dynamically adjusted? A: Not by the current implementation. There's a lot of approaches to do the simulation. We're focused on having the right design for specifying the model.

There is not input to the system. It's a closed system. We model the device, the physics, and that's it. We do have the ability to extract input from external files, but that's just for convenience.

Comment: This is kind of like model checking. At some point you just throw more hardware at it.

Q: You're modeling completely closed system. Who synthesized the controller? A: That's the RIDL part that the engineer writes. That's from a previous work (FRP). The engineer writes the controller separately.

Q: Can you give a constraint and verify that the system satisfies it? Can you let the engineer give constraints/facts as hints to the simulation? A: Future work.

About The Next Meeting

Next meeting scheduled on 9/19 (Sat). The previous meeting was 1-day on Sat. This time was 1.5 day on Sat.

Topics?

Invite more diverse speakers?

Compilers?

We have more people than last time. 50% hit rate this meeting. To spread the the word, we have homepage and mailinglist. This is working well.

Let people know about this meeting whenever you talk to people working in relevant field.

a lot more students presented than last time. wonderful.

Ed: this was a good opportunity to get practice. was very unintimidating to present here.

geographical disadvantage.

restricted only to Texas universities? not really. anybody who feels close enough. it's a trip for someone in new Mexico though. relatively sparse compared to new England

UT was pretty good in terms of centrality. Rice and Abilene both 3 hour drives. moving around important.

Oklahoma might be next one. we contacted Dallas but no one showed up so far.

2-day gives you chances for evening events.

if you have any other comments, feel free to email William or Walid personally. we can put them up as anonymous feedback.

how many people not gone to other CS conference before? 4. how many not presented before at a CS conference? 6. how many presented only once before at a CS conference or workshop? 4. that's more than 1/3. this count will give us a way to assess the contribution of this meeting.

Everyone send slides to William. IN PDF. PowerPoint not allowed of course. If using Windows, try CutePDF.

-- JunInoue - 29 Apr 2009

Topic revision: r3 - 29 Apr 2009 - 02:09:11 - JunInoue