K: We need to talk about group

112 points
1/20/1970
a year ago
by chrispsn

Comments


pavlov

It’s awe-inspiring to see a K program where the comments are in another APL variant (I can’t pretend to know which):

  {d:~1=':s:^x             / s‿e←1⊸»⊸(>⋈<)' '=𝕩
   c:^"aeiou"?_x@&d&~s     / c←¬(s/𝕩)∊"aeiou"
   x,:,/$`pig`dog c        / ins←⥊c⊏["pig","dog"]
   x@<(+\d),{3}#c+2*!#c}   / ((+`s+e)∾3/c+2×↕≠c) ⍋⊸⊏ 𝕩∾ins
This program implements a Pigdog Latin translator, bien sûr.
a year ago

krick

Amazingly, APL is more readable. Actually, it's even pretty intuitive. Which proves again that restricting code to ASCII, when every competent programmer (and PC user in general) can enter almost any Unicode symbol just fine, is simply stupid.

a year ago

habitue

You can enter any symbol but not quickly unless you set up some hotkeys or something. There is a key on the keyboard for e. Whatever you have to do for a unicode set symbol, it's going to be harder than hitting the e key.

And with regards to hotkeys, let's say you do Ctrl+alt+shift+esc+e to insert a unicode symbol, is that really better than just having a programming language where the common hotkeys are spelled out with a few ASCII characters?

a year ago

magicalhippo

> There is a key on the keyboard for e. Whatever you have to do for a unicode set symbol, it's going to be harder than hitting the e key.

Seems https://fluxkeyboard.com/ might be a great fit for programming in APL and similar.

a year ago

Avshalom

I have the apl keyboard set to super so ∊ is exactly as hard as E.

a year ago

habitue

That's the reason the apl keyboard is good!

a year ago

scrawl

i don't think it's stupid. i also find k more readable. 2 reasons why:

- k has far fewer primitives. it's easier to remember a smaller set of operations.

- k is statically parsable. APL is not. we know the program structure simply by reading the code.

this is completely subjective. because you find APL more intuitive doesn't prove anything. to each their own.

a year ago

wruza

←⥊c⊏∾×

Yeah, I’d love to discuss with colleagues how square-c works together with right double crowbar.

No, thanks.

a year ago

pavlov

If you see the following:

x = √y

… would you pronounce it as “x, two parallel horizontal lines, V with a left hook, y”?

The symbols have a meaning that’s unrelated to their appearance. It’s the same in APL. The “square c” is just like “V with a left hook”.

a year ago

wruza

And octothorp (#) is “eight-something”. Circumflex (^) is “bent around”. Names of most characters only sound academic because we don’t know Latin and Greek too well.

“√” isn’t even a natural character, it’s a graphical delineator between an index and a radicand, similar to ÷ or % forms of division. By “obelus” (sharpened stick) do you mean dot above and below a line, a percent, a dagger or just a line?

Sure there will be enough confusion with ~10-30 extra symbols that aren’t even on a keyboard and may not have a single meaning or a name.

a year ago

jazzyjackson

???

The point is, within the programming language, they do have a single meaning and a name.

a year ago

draw_down

[dead]

a year ago

adregan

a year ago

tempaway45722

[flagged]

a year ago

warent

this looks like some compiled program's binary viewed as unicode WTF

a year ago

semi-extrinsic

Isn't the original APL the only lang that uses actual Unicode symbols for operators?

a year ago

Jasper_

APL predates Unicode! You used to have to use special APL keyboards to write it. The symbols were included in Unicode as a way to try and make APL more palatable.

a year ago

jdlshore

> The symbols were included in Unicode as a way to try and make APL more palatable.

That seems highly unlikely. Do you have a source, or are you sharing speculation as fact?

It’s more likely that APL symbols were included in Unicode as part of its mission to unify all the world’s scripts into a single code set.

a year ago

shrubble

They weren't included to make it more palatable, but because APL was being actively used in larger systems.

a year ago

ksherlock

There are esoteric code golf languages that use unicode. MPW (shell, assembler) used extended characters (MacRoman, not unicode but then again, APL also predates unicode) for some operations.

eg, instead of "echo yo >> file" you would "echo yo ≥ file"

a year ago

messe

Julia has support for Unicode operators. Most editor plugins for Julia allow you to enter the characters using (La)TeX syntax, so it’s quite common to see.

a year ago

sli

Unicode support doesn't really seem that uncommon, in the grand scheme. Rosetta Code's wiki has a page for languages that support Unicode variable names[0] and some of them were surprising to me (e.g. AppleScript).

Of course, supporting Unicode variable names doesn't mean a language supports Unicode anywhere, but it's a starting point for historical research at least.

[0]: https://rosettacode.org/wiki/Unicode_variable_names

a year ago

jlg23

APL predates the first documented thoughts (not specification/implementation) of unicode by roughly 20 years.

a year ago

bbarnett

> APL predates the first documented thoughts

I used to think that Lisp was bad, with its "God coded the universe in Lisp", then I read only the first part of your sentence, and thought "Whoa, so APL predates thought?!"

https://www.youtube.com/watch?v=5-OjTPj7K54

a year ago

pavlov

There’s also at least BQN, which I suspect is the language used in those comments:

https://mlochbaum.github.io/BQN/

a year ago

chrispsn

Yep, the comments are the BQN version.

a year ago

dan-robertson

Raku (fka perl6) has some, eg you can use <atom emoji>+= to atomically increment a value.

a year ago

mostlylurks

Haskell allows defining unicode operators, and has an extension that allows you to use unicode symbols instead of ascii sequences in various parts of the syntax.

a year ago

remexre

Agda does too! https://plfa.github.io/ for examples

a year ago

lizmat

No.

In the Raku Programming Language there are several non-ASCII operators, although each of them has a pure ASCII equivalent. Some examples: ≤ vs <=, ≠ vs !=, ∈ vs (elem), etc.

The full list: https://docs.raku.org/language/unicode_ascii.html

a year ago

Avshalom

this is a bit obscure but Trealla Prolog allows for full unicode in the source and user defined operators, which is nice.

a year ago

thechao

Fortress.

a year ago

qsort

I... just can't. I work a lot with data, so it sounds like it should be a good fit for what I do, but I find that writing the damn code in a normal language is much more productive. Maybe it's my brain that's wired wrong, but I fail to see how the entire construction is supposed to help.

a year ago

mandevil

A friend worked on a project in 2005 that had lots of K code in production. They had a contract with Kx Systems and Arthur Whitney in particular because, in my friend's opinion, only Whitney could really understand K code well enough to debug it. By my friend's description, it took my friend two days just to comment the code into something grokkable by a normal developer, whereas AW didn't need to do that. Of course, it took Whitney those same two days of staring at the code before he said "Oh, of course, how silly of me," found the bug, and got it all resolved.

Part of those two days of commenting was that since K is interpreted all of the developers would hand-obfuscate all of the code so that the variables were a, b, c, etc. so that they would be slightly faster to parse than a multi-character string. This sort of thing is done all the time now, with JS to make it load faster, but his team at least didn't have any scripts to automatically turn developer-friendly code into interpreter-friendly code, they worked with the interpreter-friendly code version.

a year ago

twoodfin

In my experience this practice isn’t obfuscation or for performance. The whole point of array processing languages is to amortize any fixed operation cost over the span of the array.

The notation-as-a-tool-of-thought camp prefers short identifiers on the principle that the more concise the expression, the easier it is to comprehend in total. This is not at all dissimilar from the general practices of mathematics.

a year ago

tluyben2

Once you get used to it, it's just faster and you stop noticing the 'oh line noise' thing. But like learning a natural language, you need to get to the point where you stop translating in your head and you think in the operators. Once you reach that, it is very hard to go back.

a year ago

therein

And Whitney is into being able to see the whole code at a single glance as far as I know.

a year ago

chongli

I think the main use for K and the other APL family languages is programmer lock-in. If you can get your employer to buy in and write the entire codebase in one of these languages, you can establish yourself as essentially irreplaceable and then demand large raises to retain you.

a year ago

Avshalom

what was your first programming language?

mine was matlab. thinking of everything as an array, especially things that aren't arrays, was the first pattern I ever learned. I've never used APL in anger but for the most part it comes very easily to me. Maybe not the code golfing and/or idiomatic APL style that people like to post but, like, using the primitives to create increasingly elaborate arrays before discarding anything I don't want, that part has always felt very straightforward.

a year ago

tluyben2

How long did you try? And what languages do you use?

a year ago

userbinator

Chinese is a "normal language" to over a billion people.

a year ago

namuol

Most of the time when you open a PL article on Wikipedia you're greeted with a Hello World example, but here you get this:

"""

[...] As a result, K expressions can be opaque and difficult to parse for humans. For example, in the following contrived expression the exclamation point ! refers to three distinct functions:

    2!!7!4
Reading from right to left the first ! is modulo division that is performed on 7 and 4 resulting in 3. The next ! is enumeration and lists the integers less than 3, resulting in the list 0 1 2. The final ! is rotation where the list on the right is rotated two times to the left producing the final result of 2 0 1.

"""

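For readers who don't speak K, the three steps in the quoted passage can be sketched in Python. This only illustrates the behaviour the quoted text describes; the helper names k_mod, k_enumerate and k_rotate are made up here and are not part of any K implementation.

```python
def k_mod(x, y):
    """Dyadic ! on two integers, as the quote describes it: x modulo y."""
    return x % y


def k_enumerate(n):
    """Monadic ! on an integer: the integers less than n, starting at 0."""
    return list(range(n))


def k_rotate(n, xs):
    """Dyadic ! on an integer and a list: rotate xs left by n places."""
    if not xs:
        return []
    n %= len(xs)
    return xs[n:] + xs[:n]


# Evaluate 2!!7!4 from right to left, one ! at a time.
step1 = k_mod(7, 4)          # 3
step2 = k_enumerate(step1)   # [0, 1, 2]
result = k_rotate(2, step2)  # [2, 0, 1]
print(result)
```

The point of the quote survives the translation: one glyph, three unrelated functions, chosen by how many arguments it gets and what type they are.
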
a year ago

sergiotapia

That has to be the tersest language I have ever seen in my life. Code golf but it's actually a real language used heavily in finance.

a year ago

dan-robertson

I think ‘used heavily’ is a bit of an overstatement.

I think the right intuition for APL-family languages is that they mostly do numpy-like operations except their set of operations tend to compose very nicely. So the idea is that one can quickly and interactively figure out a composition of the operations which will do the calculation you want, but you don’t have a complicated compiler and spend most of your time in the operators rather than the interpreted language, and the operators tend to make the cpu happy – they work on contiguous memory, tend to be vectorized, don’t branch unpredictably, etc – so even if you have to compose many steps, you win on the time to write and the time to execute can be hard to beat because the constant factors of each individual operation are good.

For obvious reasons, things like numpy, pandas, dplyr, etc are more popular as their syntax is a bit more readable and it is easier to get data in/out. I think they do lose a bit by not having lots of the useful compostable APL-style operators because those things don’t have comprehensible names.

a year ago

eismcc

I wrote http://KlongPy.org to blend Klong and python. Eat your cake and have it too :)

a year ago

rippercushions

> compostable APL-style operators

APL-style operators have been sent to the compost heap of history for a reason.

a year ago

dan-robertson

Fold

Scan

Map

Head/tail

Index (with an array)

The integers up to n

Concatenate

Flatten

Most of the operators are already familiar but they tend to work well on multidimensional arrays and with each other due to broadcast semantics (and some axis selection features). In other languages you may see the magic broadcasting for some operations like addition but many of the functions only work on vectors.
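
As a rough illustration of that list, here is each operation in plain Python using only the standard library. The variable names are arbitrary. Plain Python lists do not broadcast, which is the gap described above: in an array language the same primitives extend to multidimensional arrays without extra code.

```python
from functools import reduce
from itertools import accumulate, chain
import operator

xs = [3, 1, 4, 1, 5]

total = reduce(operator.add, xs)               # fold: 14
running = list(accumulate(xs, operator.add))   # scan: [3, 4, 8, 9, 14]
doubled = [2 * x for x in xs]                  # map: [6, 2, 8, 2, 10]
head, tail = xs[0], xs[1:]                     # head/tail: 3 and [1, 4, 1, 5]
picked = [xs[i] for i in [0, 2, 4]]            # index with an array: [3, 4, 5]
upto = list(range(5))                          # the integers up to n: [0, 1, 2, 3, 4]
joined = xs + upto                             # concatenate
flat = list(chain.from_iterable([[1, 2], [3], [4, 5]]))  # flatten: [1, 2, 3, 4, 5]

print(total, running, picked, flat)
```
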

a year ago

therein

Some exchanges like BitMEX are written in kdb+, as in, the trade engine.

a year ago

dan-robertson

What is a trade engine?

a year ago

mandevil

The actual thing that lines up a series of sell-at-this-price bids and the series of buy-at-this-price bids (or even more complicated types of orders, this is the easy case) and crosses them over, figuring out who gets what bids filled by whom at what price, at supremely fast speeds and large volumes.
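
A minimal sketch of that easy case, assuming plain limit orders and price-time priority. The OrderBook class and its submit method are hypothetical names for illustration; this is not how kdb+ or any real exchange implements matching, and it ignores everything that makes production engines hard (latency, persistence, cancels, other order types).

```python
import heapq
from itertools import count


class OrderBook:
    """Toy limit order book with price-time priority (illustrative only)."""

    def __init__(self):
        self._bids = []      # heap of [-price, seq, qty]: highest price first
        self._asks = []      # heap of [price, seq, qty]: lowest price first
        self._seq = count()  # arrival counter breaks ties between equal prices

    def submit(self, side, price, qty):
        """Match an incoming limit order; return fills as (price, qty) pairs."""
        is_buy = side == "buy"
        own, opposite = (self._bids, self._asks) if is_buy else (self._asks, self._bids)
        fills = []
        while qty > 0 and opposite:
            best = opposite[0]
            best_price = best[0] if is_buy else -best[0]
            crosses = best_price <= price if is_buy else best_price >= price
            if not crosses:
                break
            traded = min(qty, best[2])
            fills.append((best_price, traded))  # trade at the resting order's price
            qty -= traded
            best[2] -= traded                   # heap order depends only on price and seq
            if best[2] == 0:
                heapq.heappop(opposite)
        if qty > 0:
            # Whatever is left rests on the book until something crosses it.
            key = -price if is_buy else price
            heapq.heappush(own, [key, next(self._seq), qty])
        return fills


book = OrderBook()
book.submit("sell", 101, 5)
book.submit("sell", 100, 5)
fills = book.submit("buy", 101, 7)
print(fills)  # [(100, 5), (101, 2)]
```

Each call handles a single message as it arrives, which is the per-order, low-latency shape of the problem rather than a batch computation over a large array.
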

a year ago

dan-robertson

I think I’ve only heard of such a component referred to as a matching engine.

a year ago

therein

I've heard it both ways. Now you have as well, so next time you see it, you won't need someone to explain it to you.

a year ago

dan-robertson

I did try looking it up fwiw. That was my guess but I was wondering if you meant something else and didn’t want to jump to conclusions.

I was surprised by the suggestion that kdb+ was used for this because a matching engine doesn’t really sound like the kind of case that APL-family languages would be well suited to – they tend to work well on large batches of data but for a matching engine, you have messages that come in over time and you want to process each message with reliable low latency (if you have higher latencies you will get wider spreads which is a competitive risk and leads to collecting less in fees), which means not operating on batches.

Do you have a source for the claim that kdb+ is used for the matching engine?

a year ago

therein

I worked on that codebase personally but here is a write-up on it:

https://blog.bitmex.com/bitmex-technology-scaling-part-1/

https://kx.com/news/kdb-powers-trading-platform-bitmex-high-...

https://www.odbms.org/2017/09/use-case-kdb-integral-to-bitme...

It really works, and also the codebase is quite mindbending. There is a weird elegance to it but also huge lock in.

a year ago

ldayley

For context: This is regarding the function of "group" as it is implemented in the K array-oriented programming language (dialect of APL).

a year ago

anonu

So the summary of that long post: we still need group but Shakti/k9 gets rid of it because it's too slow forcing the user to diy it with other primitives as needed?

I love it... It's the exact opposite of every other language design. Others: let's ship it with every tool a developer might ask for. K9: Occam's razor to everything. Nothing is safe.

a year ago

r9550684

there's a number of visionaries, for lack of a better term, who have been following similar paths in their respective schools. chuck moore, after FORTH got standardized, went back to the drawing board and made colorForth, where he reduced the number of available words to the bare essentials, very similar to how Arthur is rethinking and removing elements of apl in K, and then of K in Shakti, down to what he sees as the bare essentials. you also have less extreme cases, like Wirth developing pascal, then the more capable modula, but then removing even arguably useful parts of modula in oberon for being non-essential. I'm pretty sure there are other examples, but they, like the examples I have given, have small dedicated followings rather than wide industry adoption, because widely adopted technologies tend to be all things to all people, as both a cause and an effect of that adoption.

a year ago

chrispsn

I think we need some kind of grouping function, but it doesn't have to be the 'generate group indices' function. I'm sure we'll have alternatives available such as `update ... by`.

a year ago

idle_zealot

These line-noise programming languages read like a cruel joke on engineers in the finance world.

a year ago

CraigRo

MS has a 500+ person mailing list for peer help on kdb/q/whatever. They also have a site license, and herds of consultants from First Derivatives.

It is much easier to become good at k/q if you have experts sitting next to you. Learning this at home (or reading about it on hn) can be very frustrating!

a year ago

co_dh

I am happily programming in a for 6 years now, and will not give it up

a year ago

pwdisswordfishc

Happily programming in a what?

a year ago

co_dh

In q/kdb. iPhone replaced q with a. Sorry

a year ago

qznc

Arthur Whitney (creator of many APL dialects) created A, then Morgan Stanley extended it into A+.

So maybe simply „a“ was the name.

a year ago

anigbrowl

I like how clever the language is, but it seems only slightly less hostile than Brainfuck. Guess I'll stick with being slow but comprehensible.

a year ago

nextaccountic

What is group?

a year ago

gdprrrr

A group is a non-empty set and an operation that combines any two elements of the set to produce a third element of the set, in such a way that the operation is associative, an identity element exists and every element has an inverse. /s

a year ago

geoah

Some context would be really nice indeed.

a year ago

Jtsummers

https://gist.github.com/chrispsn/3450fe6172a7cc441d0819379ed... <- Here you go, some context about what group is and does (or was and did, in this case) in the K language.

a year ago

chrispsn

Good point - added a simple explanation in the article:

Group tells you the places each element occurs in a list. It generates lists of indices.
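
A minimal Python sketch of that description, assuming the result is a mapping from each distinct value to its list of indices (K dialects differ on the exact result shape). The function name group mirrors the K primitive, but the code is only an illustration.

```python
def group(xs):
    """Map each distinct element of xs to the list of indices where it occurs."""
    out = {}
    for i, x in enumerate(xs):
        out.setdefault(x, []).append(i)
    return out


print(group("mississippi"))
# {'m': [0], 'i': [1, 4, 7, 10], 's': [2, 3, 5, 6], 'p': [8, 9]}
```
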

a year ago

nextaccountic

Nice! Sounds useful for scatter and gather algorithms

https://en.wikipedia.org/wiki/Gather/scatter_(vector_address...

a year ago

jpf0

Group is (was) a function in various array languages, including K, APL, BQN, J, and Shakti.

a year ago

co_dh

Group is actually like inverting a mapping. Given an array A which maps index to value, group A maps value to index. A common idiom is: desc count each group A, which lists the most frequent element first.
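
To make the inverse-mapping view concrete, here is a rough Python sketch. The names group, ungroup and most_frequent are illustrative, not q built-ins; most_frequent imitates the desc count each group idiom, and ungroup shows that the original array can be rebuilt from the grouped form.

```python
from collections import defaultdict


def group(a):
    """Invert index -> value into value -> list of indices."""
    inverse = defaultdict(list)
    for index, value in enumerate(a):
        inverse[value].append(index)
    return dict(inverse)


def ungroup(g):
    """Rebuild the original array from value -> indices."""
    a = [None] * sum(len(indices) for indices in g.values())
    for value, indices in g.items():
        for i in indices:
            a[i] = value
    return a


def most_frequent(a):
    """Count each group, then sort descending by count."""
    counts = [(value, len(indices)) for value, indices in group(a).items()]
    return sorted(counts, key=lambda pair: pair[1], reverse=True)


a = ["x", "y", "x", "z", "x", "y"]
print(group(a))          # {'x': [0, 2, 4], 'y': [1, 5], 'z': [3]}
print(most_frequent(a))  # [('x', 3), ('y', 2), ('z', 1)]
```
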

a year ago

ape4

The Linux /etc/group file was my first and incorrect thought.

a year ago

[deleted]
a year ago

anyfoo

From the way I often use Haskell, Matlab, and even crazy shell one-liners with lots of pipes, awk, sort, and other stuff, I often have the impression that array languages are exactly what I'm looking for for certain problems.

The question is... which array language should I pick? What are the reasons to pick one over the other?

a year ago

Avshalom

J i guess. https://www.jsoftware.com/#/

I don't really like J but the only serious APL left is the proprietary (though free, and fun) Dyalog; K, the official product, is also proprietary and because Arthur rewrites it every couple of years all the open source clones are of different versions so there isn't much of an ecosystem.

a year ago

anyfoo

Hmm, so why J over Dyalog APL then?

a year ago

Avshalom

it's libre and gratis, if that's not something that bothers you (and to be sure if you're just writing code for your own edification it's not a practical problem) then i would recommend dyalog, although as a side effect of being open J has a (perhaps only slightly) larger free ecosystem and community.

the down side is that subjectively some people (myself included) find J uglier, less effective as a tool of thought and, due to some syntactic features (hooks and forks, but no proper lightweight anonymous function syntax), prone to people posting impenetrably dense code as "examples"

a year ago

Qem

Do you know if it's available from GNU/Linux repositories? One thing that bothers me with single letter language names is that it makes searching information cumbersome, due to lots of noise in search results.

a year ago

anyfoo

I see, thanks! Looks like I might try both. The "no proper lightweight anonymous function syntax" sounds weird to me, I would have thought that's paramount for an array language? Do I misunderstand?

a year ago

Avshalom

In J if you want to do something fairly complex without writing a function you can pile up a bunch of verbs/operators into a big (point free) verb-train. You can't do recursion, loops or explicit flow control but in the Array Languages that's not much of a handicap, it can get pretty hairy to read though

In dyalog you can (as of about v13) do that if you want, or you can just use curly brackets { and inside of them have lexically scoped variables, multiple expressions, recursion and if/then (no loops either though) }

a year ago

jpf0

J recursion is $: You can do flow control in both scripting style and array style. Loops are loops.

a year ago

Avshalom

is $ meaningful in verb trains? because that was what I was referring to when I said no loops/recursion/flowcontrol

N.B. a sibling says J has added a direct definition construct while I wasn't watching, which renders my comment largely irrelevant, although I feel the general point that a lot of J 'example code' tends towards difficult-for-noobs-to-parse verb trains still holds.

a year ago

jpf0

Yes. this one is fun. Recursive, memoized Fibonacci, the 155th integer precisely.

{{(-&2 +&$: -&1) ^: (1&<) M. y}} 155x

It'll run in your browser in 0.003 seconds.

https://jsoftware.github.io/j-playground/bin/html2/#code=%7B...
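
For comparison, roughly the same computation in Python. This is a sketch based on one reading of the J one-liner (recurse on n-2 and n-1 when n is greater than 1, otherwise return n, with lru_cache standing in for M.); it has not been checked against J's output.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    """Memoized recursive Fibonacci with fib(0) = 0 and fib(1) = 1."""
    return n if n <= 1 else fib(n - 2) + fib(n - 1)


# Python integers are arbitrary precision, so no 155x-style suffix is needed.
print(fib(155))
```
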

a year ago

moonchild

This was true historically, but recent versions of j introduced a 'direct definition' syntax similar to dyalog's dfns.

a year ago

scrawl

i love k. it's a much smaller and more regular language than APL and some of its derivatives. you can look at John Earnest's oK. it's fantastically documented and a great learning resource.

i would also recommend BQN. it has an active community and its designer Marshall Lochbaum explicitly tried to address some of the warts in APL and j. he's done a great job.

learning any of the array languages will be a tremendous learning experience if you haven't approached the paradigm before

a year ago

jim-jim-jim

I had a similar feeling, but ultimately found Haskell more maintainable. If you write it point-free with well chosen symbolic operators, it winds up being the best of both worlds.

Not that I have any firsthand experience with the matter, but I think Morgan Stanley has been switching over from K to some Haskell-like dialect as well.

a year ago

co_dh

Q/kdb. It is practical, and you can find a well-paying job with it.

a year ago

msla

Imagine what APL could do with a Haskell-style type system and enforced referential transparency.

APL programmers want to say their language is mathematical notation, well, make it mathematical notation.

a year ago

chrispsn

In the Wordle dict example, it would have helped if a type system could tell me I was wrong when I assumed that an indexing miss would generate an empty list.

a year ago

natas

Arthur will address this in k10 :)

a year ago

KrugerDunnings

is there any way i can try this out?

a year ago

chrispsn

Yep - added a link to some simple examples in the article (below). Throughout the article there are also a few links to executable versions of the code examples.

https://ngn.codeberg.page/k/#eJxdjssKwjAQRffzFddVFyLFoiiFQD8...

a year ago

eggy

Or you could try J. Another terse array language that uses ASCII symbols.

https://www.jsoftware.com/#/

a year ago

maximus-decimus

ngn k : https://codeberg.org/ngn/k

kona : https://github.com/kevinlawler/kona

also a javascript implementation, even with a graphics library : https://github.com/JohnEarnest/ok

edit : assuming you were talking about k and not freq

a year ago

yakubin

OMG: <https://codeberg.org/ngn/k/src/branch/master/a.c>

My maths notebook from high school looked like that. I'm having flashbacks. (At least in text files you can't cram 4 lines of text into 1.)

a year ago

semi-extrinsic

Ah, look at that ngn k link. They even golf the file names of the source code of the interpreter itself :D

a year ago

tluyben2

Not golf; that’s the norm. Makes things easier to fit a page without scrolling and looks/feels more like programming in k.

https://code.jsoftware.com/wiki/Essays/Incunabulum

a year ago