Pairing With GPT-4

267 points
a year ago
by vinnyglennon

Comments


logankeenan

This is similar to how I’ve been using ChatGPT. If you know what you need to do and how to do it, then you can tell ChatGPT what to do in a descriptive prompt. It gets you there 75% of the way and then you need to tweak things or ask it to refactor. For me, it’s removed a lot of grunt work and I can focus on the problem rather than coding.

It reminds me of working with a junior developer: you need to understand how to communicate with them effectively.

a year ago

louiskw

> If you know what you need to do and how to do it

Having used it for the last week or so to help me translate Python -> Rust, I'd say GPT4 is on par with even senior engineers for anything built in. The key areas it falls down on are anything involving org/repo knowledge or open source where things can be out of date.

There's a few teams working on intelligent org/repo/opensource context fetching, which I think will lead to another 10X improvement.

a year ago

bradgessler

This is a use case I could see GPT really excelling at: porting libraries from one programming language to another. How awesome would that be? Newish ecosystems like Rust, Crystal, Elixir, etc. could have a larger set of libraries in a shorter amount of time.

There’s still a lot of need for a human to be involved, but it’s not hard to see how this will make individual developers more productive, especially as LLMs mature, plug themselves into interpreters, and become more capable.

a year ago

rmak

By 2025, the entire WordPress library, including popular plugins, will be written in Rust/C, and it's inevitable that we'll be converging towards assembly language for all software, including web applications.

a year ago

melvinmelih

Indeed - why stop at human-readable code? At some point in time we will be solely reliant on AI to understand our own codebase.

a year ago

pmoriarty

And after that we'll just stop being able to understand any code whatsoever, since it'll be written by AIs in a way no human could ever understand.

Not to mention that humans will stop learning how to program, since they won't need to as the AIs will be doing all the coding for them.

There'll be a general de-skilling of humans, as they offload the doing of virtually everything to robots and AIs.

Then Idiocracy will have truly arrived, only it'll be even worse.

a year ago

TheHappyOddish

I for one love Brawndo!

a year ago

mgoetzke

It's got electrolytes !

a year ago

akiselev

Anyone remember Dreamweaver or Microsoft Word HTML output? I imagine the AI generated code will be a lot like that - massively over-constrained and over-engineered to fit the impedance mismatch of a WYSIWYG (what you speak is what you get) editor.

You thought GUIs were bloated and slow before? Just you wait!

a year ago

int_19h

I've been experimenting with GPT-4 to see how it could be integrated in video games for NPC decision making. After trying a few different approaches to give it access to the world state for reasoning, I found that it seems to do best when given a bunch of table definitions and told to write SQL queries against them; for some reason, it navigates object graphs much worse. Here's an example schema and session (the game here is Mount & Blade 2: Bannerlord, which is convenient because it's written in C# and it's easy to access and dump the entire game state in whatever format):

https://gist.github.com/int19h/4f5b98bcb9fab124d308efc19e530...

Take a close look at the SQL queries it generated to answer the question. On one hand, it's very impressive that it can write a complex query like this as part of its reasoning in response to a question like "pick a bride for your son":

    SELECT name, age, clan_name
    FROM (SELECT * FROM people
          WHERE name IN ("Mesui", "Abagai", "Yana", "Korte", "Bolat",
                         "Chagun", "Alijin", "Unagen", "Sechen")) AS potential_brides
    INNER JOIN clans ON potential_brides.clan_name = clans.name
    WHERE clans.tier = (SELECT MAX(tier) FROM clans
                        WHERE kingdom_name = (SELECT kingdom_name FROM clans
                                              WHERE name="Oburit"))

On the other hand, this is clearly far from the optimal way to write such a query. But the thing is - it works, and if there's no way to get it to write more performant queries, the problem can be "solved" by throwing more hardware at it.

So, yes, I think you're right. The more of the code out there is written by AI, the more it will look like this - working, but suboptimal and harder to debug.

a year ago

chaxor

I'm always baffled that people complain that the code provided is not 'performant enough' or 'is too verbose', etc. It's usually the most typical solution any programmer would give - but the beautiful thing about it is that you can ask it to adjust the code to match a given constraint. Just like your colleague. Only usually far, far better, honestly - because you get a response in about 3 seconds and iterate to the answer in a few minutes total, rather than receiving something (maybe it's good, maybe it's entirely missing the constraints) in 4 days.

a year ago

akhosravian

Is there something in particular that leads you to that belief?

a year ago

[deleted]
a year ago

resters

I thought that at first.

I think with LLMs we will start to see code with better and more descriptive comments, since there is significant advantage to be found when the code has as much information and context as possible.

Human programmers often hoard that context for job security or out of laziness.

a year ago

weatherlite

I don't share the enthusiasm since it probably means humans won't be looking at code much anymore, at least not as a paid job.

a year ago

chaxor

I am starting to think of this more in the sense of 'not many people write in assembly anymore' or 'some people write C - sure - but writing in Nim can be a nicer experience and can take care of some of the details for me', and it has been helpful for me to come to terms with the direction things are moving. Granted, this is a big step to higher-level "programming" (and those quotes are now bigger than ever), but, as pointed out, these LLMs are even better when you know how to use the linguistic interface like a senior dev rather than a person off the street. Effectively, devs can now work to make decisions closer to the C-suite by framing the problem and constraints, and spend less time on frustrating errors while developing.

a year ago

reidjs

The goal shouldn’t be to code, but to create useful software.

a year ago

int_19h

The real concern is that code will become too incomprehensible for most humans to debug before we solve hallucinations, etc. (if they are even solvable), leading to bugs.

a year ago

logankeenan

> I'd say GPT4 is on par with even senior engineers

I agree; I avoided saying that in my original comment to avoid nitpicking. The depth of knowledge is incredible. It’ll just make junior-level mistakes while writing senior-level code.

> The key areas it falls down on are anything involving org/repo knowledge or open source where things can be out of date.

With GPT-4, I’ll paste snippets of code after the prompt so it has that context, or even documentation of the API I need it to work with. The web client is still limited in the number of tokens it’ll accept in the prompt, but that is expected to increase at some point.

a year ago

IanCal

Also the setup described in the post is like pairing with someone that can't run the code and is writing in notepad.

More intelligently putting it in a loop and giving it other helpers (which we're now seeing worked on more) should massively improve performance.

a year ago

f0e4c2f7

This is also available with ChatGPT plugins in alpha; one of the plugins is web search. It works pretty well in Bing, which also allows this and is reportedly based on GPT-4. Short of that, another thing I've found to work pretty well is pasting the API docs for the relevant API at the beginning of the prompt and explaining that it is the more up-to-date info.

a year ago

camillomiller

Same experience. For me the gold standard for a problem I can conceptualize but can’t write code for is to start a prompt with “how would you do this in [given language]?”

That has taught me some things I didn’t know and helped me formalize and understand concepts better (I think).

Also, on a more funny note, it’s the best ffmpeg interface I’ve ever used :)

a year ago

moffkalast

Yeah it's starting to remind me of the Star Trek holodeck interface. You still need to know exactly what needs to be done, but it can handle the boilerplate and a rough first implementation, then I can spend my time tweaking it a bit to get exactly what's needed. Probably saved me a whole day of work on a project yesterday.

a year ago

spaceman_2020

For some problems, somehow GPT-3.5 is better than GPT-4.

I asked it to help me extract certain values from a spreadsheet. It gave me a complicated and overly long formula using FIND and MID that didn't even work.

GPT-3.5, otoh, gave me a neat little REGEXEXTRACT formula that worked perfectly.

When I pointed out to GPT-4 that it can just use regex instead, it "apologized" and again rewrote a formula that didn't work.

a year ago

spookthesunset

Yes. I love the idea of using it for ffmpeg!

a year ago

dimal

Same. It’s like working with an incredibly smart junior engineer who doesn’t think things through, sometimes solves problems that don’t exist, and then when you ask for a minor change, inexplicably rewrites everything to be “better”. I haven’t gotten great results yet, but it feels like I just have to figure out how to ask questions correctly so I can lead it down the right path. Seems like I’ve been asking it to solve problems that are too large, and once there’s an interaction of two or more functions, it goes off the rails. But if I define the overall structure and the interfaces, then I have better luck getting it to fill in the functions. Our jobs are safe, for now.

a year ago

brundolf

Maybe I should just try it, but I don't feel like there are many situations where the process of "come up with a way to describe exactly what I want, generate it, and then debug the output until it works" would be faster than "just write it myself". And most of the ones I can think of would be in a situation where I'm working in an unfamiliar language/framework, and it's less about not having to type out the code and more about not having to context-switch to reference the docs for something I don't know off the top of my head

Which- maybe that's just the thing, maybe that's what people are using this for right now

(This could change once it gets to the point of stamping out entire projects, but for now)

a year ago

CookieCrisp

If you haven't tried it out, I recommend you do. I've been blown away with how much it will just get right. I started out giving it small tasks, but I have continually been in a loop of "oh, I could have just asked it to do this larger task that contained the subtask I asked for, and it would have done it", because I keep thinking "surely it won't be able to do the whole thing", and each iteration I am asking it for more and more.

As far as debugging, I find that I can just say "are there any bugs in that code?" and it does a great job at finding them if they're there (even though it's what it just gave me)

a year ago

montecarl

I'll give two examples of code I could for sure write myself but that I had gpt4 write for me because I figured it would be faster (and it was).

The first one was to write a python script to watch a list of files given as cmd line arguments and plot their output anytime one of the files changed. It wrote a 100 line python script with nice argument parsing to include several options (like title and axes labels). It has one tiny bug in it that took a couple of minutes to fix. When I pointed out the bug it was able to fix it itself (had to do with comparing relative to abs file paths). If I wrote the script myself I would not have made something general purpose and it would have taken maybe 30 minutes to do.
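
For a sense of what that looks like, here is a rough sketch of that kind of script (my own illustration, not the generated code):

    # Sketch: watch data files passed on the command line and re-plot whenever one changes.
    import argparse
    import os

    import matplotlib.pyplot as plt
    import numpy as np


    def parse_args():
        parser = argparse.ArgumentParser(description="Re-plot data files whenever they change.")
        parser.add_argument("files", nargs="+", help="data files to watch (one column of numbers each)")
        parser.add_argument("--title", default="", help="plot title")
        parser.add_argument("--xlabel", default="x")
        parser.add_argument("--ylabel", default="y")
        parser.add_argument("--interval", type=float, default=1.0, help="poll interval in seconds")
        return parser.parse_args()


    def draw(ax, files, args):
        ax.clear()
        for path in files:
            ax.plot(np.loadtxt(path), label=os.path.basename(path))
        ax.set_title(args.title)
        ax.set_xlabel(args.xlabel)
        ax.set_ylabel(args.ylabel)
        ax.legend()


    def main():
        args = parse_args()
        # Normalize to absolute paths up front (the relative-vs-absolute mixup
        # mentioned above is exactly the kind of bug that sneaks in here).
        files = [os.path.abspath(f) for f in args.files]
        mtimes = {f: None for f in files}
        plt.ion()
        fig, ax = plt.subplots()
        while True:
            current = {f: os.path.getmtime(f) for f in files}
            if current != mtimes:
                mtimes = current
                draw(ax, files, args)
            plt.pause(args.interval)


    if __name__ == "__main__":
        main()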

The second example required no fixing and appears bug free. I asked it to write a python function to take a time trace, calculate and plot the FFT, then apply FFT low pass filtering, and then also plot the filtered time signal vs the original. This is all straightforward numpy code but I don't work with FFTs often and would have had to look up a bunch of different API docs. Way faster.
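
Again as an illustration rather than the generated code, the core of such a function is only a handful of numpy calls:

    # Sketch: plot the FFT of a time trace, low-pass filter it in the frequency
    # domain, and plot the filtered signal against the original.
    import numpy as np
    import matplotlib.pyplot as plt


    def fft_lowpass_plot(t, signal, cutoff_hz):
        dt = t[1] - t[0]  # assumes uniform sampling
        freqs = np.fft.rfftfreq(len(signal), d=dt)
        spectrum = np.fft.rfft(signal)

        # Zero out everything above the cutoff, then invert the transform.
        filtered_spectrum = np.where(freqs <= cutoff_hz, spectrum, 0)
        filtered = np.fft.irfft(filtered_spectrum, n=len(signal))

        fig, (ax1, ax2) = plt.subplots(2, 1)
        ax1.plot(freqs, np.abs(spectrum))
        ax1.set_xlabel("frequency (Hz)")
        ax1.set_ylabel("|FFT|")
        ax2.plot(t, signal, label="original")
        ax2.plot(t, filtered, label="low-pass filtered")
        ax2.set_xlabel("time (s)")
        ax2.legend()
        plt.show()
        return filtered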

I have also had it write some C macros for me, since complex C macros can be hard to escape properly and I'm not comfortable with the syntax. It's been 100% successful there.

a year ago

brundolf

Yeah, so I think all of these fall under that category I mentioned (the first one may or may not for you, but it would for me because I don't work with Python much): most of the time saved is in looking up a bunch of things to assemble together, not ideating and writing out the logic. I can see this kind of codegen being useful in special situations, but it wouldn't really be useful in my day-to-day work because almost by definition I'm very familiar with the technologies I use every day

a year ago

ben_w

> come up with a way to describe exactly what I want

This is the thing: you don't need to be exact. Even very vague gets a long way.

a year ago

[deleted]
a year ago

avereveard

yeah people are scratching the surface of the potential.

you can write prompt to clean unstructured text into structured text. that becomes the work area.

instead of asking gpt "write me some code in python that does x" you can just tell gpt "do x with the structured data in the work area and present a structured result"

gpt becomes the computation engine

granted it's super expensive and quite slow, today

but with a few plugins to take care of mechanical things and the token memory space becoming increasingly bigger, writing the programming step directly in there may become the standard. as soon as vector storage and reasoning steps get well integrated, it's going to explode.

a year ago

[deleted]
a year ago

ZitchDog

I doubt it will be the standard. I think it will always be faster to process data using a script generated by the LLM.

One thing I could see is a "self updating" script, that adjusts its logic on the fly when new data are found.

a year ago

avereveard

on one hand it's also exploitable as hell via prompt injection, on the other hand for certain tasks it's more robust because it works off the intention you give it, not the instructions, so it can get back on track.

a year ago

cameronfraser

Exactly! I've been telling people for a bit, it's like having a junior dev on the team that knows a lot of weird stuff

a year ago

TheHappyOddish

That's precisely how I frame it to people.

I'm a member of the management team for a company, and everyone was very excited for ChatGPT. As someone with a technical background, the caveats were a lot more obvious to me from the start. Treating it as an incredibly fast, eager junior developer who is desperate to please and submits code without testing it has made it a lot more usable.

a year ago

js2

The other day I made a change to one of our iOS builds such that the ipa (a zip file) now contained a subdirectory called "Symbols/". I wanted to know how much space the compressed symbols grew the overall ipa by. I immediately typed this on the command line:

    zipinfo -l foo.ipa "Symbols/*" | awk '{t+=$6}END{print t/1024/1024}'
I work with developers who are not CLI-fluent and wondered what they'd do. And then I realized, they might ask ChatGPT. So I asked it: "How do I figure out how much space a subdirectory of files takes up within a zip file?"

It took a lot of prompting to guide it to the correct answer:

https://imgur.com/a/kAaMova

The weird part to me is how sure it is of itself even when it's completely wrong. (Not too different from humans I work with, self included, but still...) e.g.

> "The reason I moved the subdir filtering to 'awk' is that the 'zipinfo' command doesn't support filtering by subdirectory like the 'unzip' command does."

Where does it come up with that assertion and why is it so sure? Then you just tell it it's wrong and it apologizes and moves on.

Now, the addendum here is that if either of us had read the zipinfo man page more closely, we would've realized it's even simpler:

   zipinfo -lt foo.ipa "Symbols/*" | tail -1
---

I also tried to use it to translate about 10 lines of Ruby (just mapping a bit of JSON from one shape to another) into jq. It really went off into the weeds on that one and initially told me it was impossible because the Ruby was opening a file and jq has no I/O. So I told it to ignore that part and assume stdin and stdout and it got closer, but still came up with something pretty weird. The core of the jq was pretty close, but for example, it did this to iterate over an object's values:

    to_entries | .value.foo
Instead of just using the object value iterator:

    .[].foo
---

One of my pet peeves is developers who never bother to learn idiomatic code for a particular language, and both of these turn out to be that sort of thing. I guess it's no worse than folks who copy/paste from StackOverflow, but I kind of hoped ChatGPT would generate things that are closer to idiomatic and more based on reference documentation but I haven't seen that to be the case yet.

---

Has anyone here tried using it for code reviews instead of writing code in the first place? How does it do with that?

a year ago

adwf

I think part of the problem is that they've wrapped all the responses in a positive language framing to make it feel more human.

It's always "Sure thing!" or "Yeah I can do that!", and it responds so quickly that you equate it to when a human responds instantly with a strong affirmative - you assume they're really confident and you're getting the right result.

If it took 5-10s on each query, had no positive framing and provided some sort of confidence score for each answer, I think we'd have different expectations of what it can do.

a year ago

quickthrower2

I am not familiar with these command line tools so I would probably just compare the file sizes of the 2 artefacts, or if there are other differences in the artefacts and I had to hone in, just open it up in the 7z GUI and take a look.

As for ChatGPT, I have jokingly thought to always tell it "that didn't work" until it settles on something it is really confident about (no more alternatives), then try that.

a year ago

int_19h

Note that this is GPT-3.5. GPT-4 solves this problem correctly with Python, at least, although it still struggles with the shell. I think that's a function of the training data - it is known that Python is the most represented language in the training set.
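
Something along these lines (a hand-written sketch of the Python approach, not GPT-4's actual output):

    # Sketch: total compressed size of everything under "Symbols/" inside an ipa.
    import zipfile

    def compressed_size(zip_path, prefix):
        with zipfile.ZipFile(zip_path) as zf:
            return sum(info.compress_size for info in zf.infolist()
                       if info.filename.startswith(prefix))

    print(compressed_size("foo.ipa", "Symbols/") / 1024 / 1024, "MiB")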

a year ago

colordrops

A form of literate code could be to add the prompt as a comment in the source code, along with a version and seed (like stable diffusion), plus the diff from the output to get what you need.
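
Something like the following, as a made-up convention just to illustrate (none of these comment fields exist in any tool today):

    # gpt-prompt: "write a function that dedupes a list while preserving order"
    # gpt-model:  gpt-4 (2023-03-14), temperature 0, seed 42
    # gpt-diff:   renamed `items` -> `values`, added the type hints
    def dedupe(values: list) -> list:
        seen = set()
        return [v for v in values if not (v in seen or seen.add(v))]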

Unfortunately the gpt api is non deterministic, probably due to stupid censorship requirements.

a year ago

bilalq

I'm sure it'll get better, but I've played around with it and gotten pretty garbage results. React components that violated the rule of hooks (and it still has problems even if you ask it to correct that bug), attempts to access non-existent API methods, and often code that just doesn't do what's asked.

Really the only prompts I've gotten decent results for were asking it to write a one-off script that interacted only with a single well-documented API like Github's. It even gets that wrong sometimes by just imagining non-existent fields on Github's APIs.

I can see it being useful one day, but I still find Copilot to be much more usable even in its v1 incarnation.

a year ago

fnimick

I have similar results, yet you have people here on Hacker News saying that it's "more productive than 10 developers put together" via their special sauce prompt engineering.

Which is right?

a year ago

Jensson

I wouldn't trust those master prompt engineers until they start to release impressive projects. If this technology actually is such a huge productivity boost it shouldn't be too hard, so until that happens I take it as evidence that the productivity boost isn't that extreme.

a year ago

tlear

It all sounds like a very well orchestrated marketing campaign; HN would be the first place I would seed with stories if I were promoting it as a dev tool.

I spent a few hours asking it to do a few things with a well-documented public but uncommon API. It sent me on a wild goose chase of functionality that the API never had (I contacted the developer to make sure I am not crazy). It wrote some stuff that looked VERY good, except for the fact that it interacted with things that were never there. Would have been nice if they were..

It seems to exercise wishful thinking. You can ask it to implement functionality and it will just imagine that the API/system has the stuff needed.

A major problem is that I, for one, have no clue how it is "thinking". Maybe over time we will develop a better understanding of it.

a year ago

vidarh

When it imagines stuff that isn't there, I ask it to implement the missing piece. Doesn't always work, but often does

a year ago

encryptluks2

[dead]

a year ago

raincole

Because different programmers face different problems. Surprise, I know.

For reference, I used ChatGPT to build a simple ChatGPT-clone web interface. The JavaScript it generated is literally 100% correct (not 100% good. By correct I mean it does what I describe in prompt, without syntax errors or calling non-existing functions. Code can be both correct and bad, obviously.)

I also tried to use ChatGPT to generate some Houdini vex code. It sometimes gave me a useful scaffold, but mostly just pure garbage.

a year ago

shtack

Probably both, depending on which developers you compare against and how you measure productivity. Keep in mind there's an entire world of development where productivity is measured by number of JIRA tickets closed for button text updates.

a year ago

[deleted]
a year ago

Jensson

> Keep in mind there's an entire world of development where productivity is measured by number of JIRA tickets closed for button text updates.

How would ChatGPT help you update button texts? Isn't that just changing a text field? ChatGPT can't search your code for where the button is, and I don't see how it would help you change the text.

a year ago

grumple

It's great for boilerplate, like a snippets+. If I ask it to solve something that's basically straight out of the documentation / walk through, it spits that back at me. Bad at doing the hard work imo. Useless at solving novel problems, or those that don't have a solution already within its data set.

Being limited the way it is - not being able to use a project's greater context, which is millions of tokens generally - is a severely limiting factor that makes it unsuitable for serious work as well. Working around this (figuring out what dependencies to inject into the prompt) is as much work as actually coding.

I can see that it's passed some sort of threshold with the greater population. It is great at generating filler text. Just what it can do right now should have many applications.

a year ago

gowld

If you have tolerance for mistakes, it can generate first drafts extremely quickly, freeing up your time to debug them. If you give up after seeing an error, it won't work for you.

a year ago

omnicognate

"Automating the tedious job of writing bugs so you can get on with the important business of investigating them," was a way I jokingly described the value proposition of AI coding assistants to colleagues in the early days of Copilot. It's amusing to see it stated unironically here.

a year ago

int_19h

Whether it saves time or not really depends on the task.

The other day, I had a subtitles file with slightly mismatched timestamps. GPT wrote me a Python script to fix them that got 90% of the way there (and, in particular, the code had all the API calls that I needed to get it to 100%, even though this is the first time I've heard of the library it used). The whole thing took less time than finding and installing the app that would do it for me.
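
For a sense of the scale of the task, a stdlib-only sketch of the same idea (the actual generated script used a subtitle library, which I won't try to reproduce here): shift every SRT timestamp in a file by a fixed offset.

    # Sketch (stdlib only): shift every SRT timestamp in a file by a fixed offset.
    import re
    import sys
    from datetime import timedelta

    TS = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")

    def shift(match, offset):
        h, m, s, ms = (int(g) for g in match.groups())
        t = timedelta(hours=h, minutes=m, seconds=s, milliseconds=ms) + offset
        total_ms = max(0, int(t.total_seconds() * 1000))
        h, rest = divmod(total_ms, 3_600_000)
        m, rest = divmod(rest, 60_000)
        s, ms = divmod(rest, 1_000)
        return f"{h:02}:{m:02}:{s:02},{ms:03}"

    def main(path, offset_seconds):
        offset = timedelta(seconds=offset_seconds)
        with open(path, encoding="utf-8") as f:
            print(TS.sub(lambda m: shift(m, offset), f.read()), end="")

    if __name__ == "__main__":
        main(sys.argv[1], float(sys.argv[2]))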

The catch is that you need to have a pretty good "gut feel" understanding of its limitations to figure out whether it's going to be a time saver or not before you sink too much time into making it do something right. But it is a skill that can be learned from experience (for a particular model, anyway), and I suspect that the ability to decide what to delegate and what to do yourself will be one of the key differences between junior and senior devs going forward.

a year ago

satvikpendem

I feel similarly to the parent. It gives me a good rough first draft of the code I need, and since I'm using statically typed languages, the errors are generally pretty minor (not giving the right type annotations for example) that it still saves me a lot of time overall from writing everything from scratch.

a year ago

gowld

As a mere mortal, I often write my own bugs, so I'm happy for a machine to write them for me so I can move on to the next step.

a year ago

spookthesunset

Agree. I’ve used it to help get Bonjour working across my home VPN. It wrote me the commands for EdgeOS, but it kept insisting on adding commands that didn’t exist for my hardware version. I kept telling it “no, that doesn’t work on my router” and it apologized and wrote the exact same thing again.

The good news is 80% of what it spat out was usable. I gave up getting it to try to give me the last 20% and figured it out myself.

One thing I’ve found helpful in those cases is to tell it that it doesn’t know what it’s talking about and ask it for good keywords to search on Google.

Someday I’d love if it just gave me the relevant links itself…

a year ago

weatherlite

> "more productive than 10 developers put together"

As long as it can't read my entire codebase, understand it and help me with it - that's absolute horseshit. I don't spend much time writing a bunch of new code, I'm spending most of it trying to understand the heaps of legacy code in my company and then make some small tweaks/additions.

The day it can understand big code repos will truly be the end for human made code I think.

a year ago

satvikpendem

Both, it depends on the domain. For Rust work, it's pretty good, since Rust is stable enough to fit within the 2021 cutoff. For Flutter, it's not that good, due to the aforementioned cutoff and the changes in Flutter since then.

I used to get more non-existent APIs but not as much with GPT 4 recently.

a year ago

nicative

I think it does help in some scenarios, like small scripts or if you are learning something new. But often it adds overhead. You have to check constantly whether or not to accept the suggestions.

I don't use Copilot anymore (at least for now). Just ChatGPT as an alternative to Google/SO.

a year ago

TheHappyOddish

Maybe not 10x, but it can certainly speed up certain projects for a senior developer, cutting through the boilerplate when starting. It's also excellent for optimisation.

a year ago

cameronfraser

You have to chain prompts to get good results, also the way you say things can be important. Think voice assistants requiring some nuance to their instructions

a year ago

treis

Here's an example we can all try (and help me out). Validating that the input into a date field is a valid date using formik and yup. The problem is that stuff like "11" gets turned into a valid date:

Abbreviated prompt:

  dateOfBirth: date().required().typeError('Invalid Date')
})

The dateOfBirth validation is incorrect because it accepts values like "11". What is the correct validation to ensure that a user has inputted a valid date?

GPTs first go:

****

  dateOfBirth: date()
    .required()
    .typeError('Invalid Date')
    .min(new Date(1900, 0, 1), 'Date must be after 1900')
    .max(new Date(), 'Date cannot be in the future'),

In this example, the min method sets a minimum date of January 1, 1900, and the max method sets a maximum date of the current date. You can adjust the minimum and maximum dates as needed to fit your specific use case.

****

Me:

It still treats "11" as valid input

GPT second attempt:

*****

  dateOfBirth: date()
    .required()
    .typeError('Invalid Date')
    .test('valid-date', 'Invalid date', (value) => {
      if (!value) {
        return false;
      }
      const date = new Date(value);
      return !isNaN(date.getTime());
    }),

*****

Me:

The parameter 'value' passed into test is a date and not a string

GPT third attempt:

******

  dateOfBirth: date()
    .required()
    .typeError('Invalid Date')
    .test('valid-date', 'Invalid date', (value) => {
      return value instanceof Date && value.isValid();
    }),
******

Me:

There is no isValid function on Date objects

GPT fourth attempt:

********

  dateOfBirth: date()
    .required()
    .typeError('Invalid Date')
    .test('valid-date', 'Invalid date', (value) => {
      return value instanceof Date && !isNaN(value.getTime());
    }),
********

Me:

That still accepts 11 as a valid date.

GPT fifth attempt:

*********

  dateOfBirth: string()
    .required()
    .test('valid-date', 'Invalid date', (value) => {
      const dateRegex = /^(?:(?:31(\/|-|\.)(?:0?[13578]|1[02]))\1|(?:(?:29|30)(\/|-|\.)(?:0?[1,3-9]|1[0-2])\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})$|^(?:29(\/|-|\.)0?2\3(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00)))$|^(?:0?[1-9]|1\d|2[0-8])(\/|-|\.)(?:(?:0?[1-9])|(?:1[0-2]))\4(?:(?:1[6-9]|[2-9]\d)?\d{2})$/;
      if (!dateRegex.test(value)) {
        return false;
      }
      const date = new Date(value);
      return date instanceof Date && !isNaN(date.getTime());
    }),
********

Which may or may not be correct. That regex is bonkers and given the errors in the previous responses I don't trust it.

a year ago

camillomiller

In my experience it works well with components that are not architectural and with vanilla languages rather than frameworks.

a year ago

scotty79

> I'm sure it'll get better

Weren't we also sure self-driving would get better rapidly?

a year ago

moffkalast

GPT-4 knows the only winning move is to not use React.

a year ago

ryder9

[dead]

a year ago

simplify

The optimal solution is to avoid react+hooks, which is too confusing for humans anyway :) Mithril.js has been plain javascript for years, and unsurprisingly enough this is easier for chatGPT / copilot to deal with.

a year ago

alexellisuk

This made for extremely painful reading (I'm not blaming the author), but I couldn't imagine being handed code full of syntax errors by GPT-4, then asking it to correct them, only to be handed more errors.

Co-pilot has been handy for generating structs and boilerplate code.

Where I think these models are currently interesting is in discovering conventions - libraries - approaches. Like an outline, which we know is probably broken - but enough for us to squint at and then write our own version.

Kind of like browsing an OSS repository to see "how it works" for inspiration.

Someone asked ChatGPT to write an OpenFaaS function for TTS - he needed to prompt it with the template name, and even then it got the handler format wrong. But it did show us a popular library in Python. To find the same with Google - we'd have to perhaps scan through 4-5 articles.

https://twitter.com/Julian_Riss/status/1641157076092108806?s...

a year ago

andybak

I think it depends on your personality - I prefer to fix code than to write it from scratch. Maybe it's an ADHD thing but I can be much more productive working like this than I can working in a void. (well - a void that has access to Stack Overflow etc)

a year ago

ghaff

It's probably the same thing as with writing text. If you find starting from a blank page is anxiety-producing, a page of generated text may be useful even if you end up rewriting a lot of it.

On the other hand, while ChatGPT is useful for some definitional boilerplate and some possibly correct listicle type of stuff, as an editor, it also guides you in a certain direction you may not want to go. I'm also used to being somewhat respectful of what humans have written, so long as it's not incorrect, even if it's not the way I would have written something.

a year ago

jhonof

> It's probably the same thing as with writing text. If you find starting from a blank page is anxiety-producing, a page of generated text may be useful even if you end up rewriting a lot of it.

This is a pretty cogent insight, I much prefer having a draft of, for example, an email and then completely rewriting it, rather than starting an email from scratch. I'm not sure why, but it makes it seem much more approachable to me. That being said, I have found GPT-4 code not useful for anything but boilerplates or small functions. It often gets large blocks of code so wrong that it is just much faster to write it from scratch.

What I have found useful so far is asking for a commented outline of what I'm asking for, with just variable names filled in but no implementation. It often gets that correct, and then I do the implementation myself.

a year ago

sh4rks

Same. Call me weird, but I actually enjoy reviewing and reading other people's code, because I'm good at spotting errors. When I code from scratch, I suffer from the programmer's equivalent of writer's block.

a year ago

workOrNah

> Kind of like browsing an OSS repository to see "how it works" for inspiration.

And that's what sourcegraph.com is for

a year ago

ta988

Sourcegraph is minimum $5k a year and the free version is wildly limited.

a year ago

taylorsperry

Hey, PM at Sourcegraph here, we recently launched a new app. It's free and easier to set up, and it lets you sync both local code and remote repos. https://docs.sourcegraph.com/app

a year ago

ta988

Interesting, I'll look into that. Thanks.

a year ago

bradgessler

Yeah, the first back-and-forth with GPT-4 in the article was a bit tedious to write, but I chose that route to make the article more relevant to my recollection of what it was like as a beginner when I got an error in my code.

A small part of me was hoping more experienced folks who read this would pick up on some level of tediousness, so it looks like that landed for some folks! There is def some tedium in GPT models.

In practice, I’ll ask GPT-4 once about syntax errors. If it keeps getting them wrong I assume it doesn’t have the context it needs to solve the problem so I’ll either change up the prompt or move on to something else.

a year ago

angarg12

This kind of matches my experience using LLMs for coding. They work eerily well for some tasks, and they are hilariously wrong at others. In general I find it performs abysmally with more obscure or complicated tasks/bugs, which makes intuitive sense to me.

Productivity booster? Sure! End of programming jobs? Eh, far from it. At the very least you need the knowledge to tell if what it is producing is good or garbage. I haven't managed to make ChatGPT tell me that it doesn't know how to do something; instead it produces BS code over and over.

a year ago

moonchrome

I find Copilot a much better booster simply because I'm much better at writing code than writing GPT prompts, and it's much more streamlined.

There were a few cases where GPT/chat interface was better but I'd say it's 95/5%.

I'm hoping Copilot X will be more streamlined. If it is, it's easily worth >$100/month.

a year ago

weatherlite

It would be nice if I could tell Copilot what I am trying to do instead of going to the web and talking to ChatGPT.

a year ago

simplify

You can. Just write a descriptive comment, enter a newline, and it will attempt to autocomplete the code for you.
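
For example, a comment like the first line below will usually get you a completion along these lines (illustrative only, not actual Copilot output):

    # count how many times each word appears in a text file, most common first
    from collections import Counter

    def word_counts(path):
        with open(path, encoding="utf-8") as f:
            return Counter(f.read().lower().split()).most_common()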

a year ago

SkyPuncher

It's really hit or miss.

I had one case where I needed state management in a really simple React app. It created an entire reducer for me, from scratch, and handled all the boilerplate cases I'd expect.

I've also had cases where it struggles to build a form button.

a year ago

everforward

My experience is that it does much better with specific things than general. It does better in statically typed languages than dynamic, and it does better with functions that do a specific small thing than something more fuzzy.

I.e. it does well at writing the body of a function that retrieves data from SQL and returns a particular type, and it does poorly at parsing strings of varying formats.

I'm not super surprised it does well at reducers and poorly at form buttons.

a year ago

nomel

I don't think they're comparable. I work in test automation, with equipment and devices. I can describe an equipment setup, and how the device is connected, and it'll write the relatively appropriate code, with electrical engineering concepts in mind. I can ask it if there are any concerns with the setup, or signal integrity, what to do about it, all with good answers.

A CoPilot comment can't do that.

a year ago

weatherlite

Tried it a few times didn't work well but I will try again...

a year ago

holoduke

It would be nice if you could simply add lines in your code like "#askgpt: can you fill in the method on line 345. It needs to calculate x. Input is y. Output is z." So you don't have to leave your editor.

a year ago

dimal

They really need to add the phrase “I don’t know” to the model.

a year ago

ngiyabonga

This is my experience as well, and not just for coding, but quite a few "knowledge-worker" type tasks I've been experimenting with: summarisation, translation, copywriting, idea generation to name a few.

GPT really shines as an uncomplaining sidekick and I'm not sure I believe the doom and gloom about commoditizing professions yet.

I'm skeptical someone with no domain knowledge can use any of these emerging AI tools to replace someone with domain knowledge + chatGPT. The latter will just breeze through a lot of grunt work and enable more time and inspiration needed for even more value adding work.

a year ago

SuoDuanDao

I'd say the legal profession is a likely pattern for how things will go - lots of demand for the people already at the top, but that added productivity plus automation of tasks typically done by junior people means fewer ways to get into the profession.

a year ago

visarga

> plus automation of tasks typically done by junior people

Juniors will be learning from GPT as well, and they might get proficient faster than the seniors did.

a year ago

atonse

This is exactly what I told my junior programmers who were worried that they’d be out of jobs.

First I said that might be true, we just don't know (couldn't sugarcoat things). But a better way to think about this right now is: imagine having a one-on-one senior programmer mentor pairing with you all day long. You will learn much faster.

Similarly I’m sure younger attorneys will benefit from having unlimited access to a more senior “analyst” of contract and legal language to spot things they miss, and can learn much faster.

a year ago

roberttod

Agree that for a while at least it seems you'll need to understand the domain it is helping you with. That said, with the improvements in GPT4 over 3 in such a short time, and with plugins, I would be surprised if it takes any longer than a couple of years to be much much better at this to the point it is correct most of the time. You will probably still need domain knowledge at that point, not quite sure if that will last either.

a year ago

passwordoops

That's my impression as well. What I'm thinking about lately is the way it could change how juniors develop domain expertise - less and less will come from cutting their teeth on these grunt tasks.

But then again, there's plenty of examples of technology evolving and grunt work evolving in response

a year ago

[deleted]
a year ago

ryanSrich

Spot on. I’ve been using GPT (3.5 and 4) to build a handful of small apps. I find that I’m much faster with GPT, but that’s because I don’t need GPT’s code to be 100% correct every time. I just need it to be mostly correct and faster at writing code than me. Which it is. If I can ask it to write me some code, and I spend 1/8th the amount of time fixing GPT’s errors than I would writing the code from scratch myself, it’s a win.

a year ago

ParetoOptimal

> because I don’t need GPT’s code to be 100% correct every time.

I expect it to have problems that are hard to spot. However, that's:

1) a useful skill to practice for code review

2) Usually faster than starting from scratch

a year ago

moffkalast

Well if you happen to know what you're doing 1) shouldn't be a problem, and it'll always be 2) if you're not a record breaking typist working with only one language.

a year ago

sorahn

Honestly, this is why I think the branding that Microsoft is using for all of this, "CoPilot", is spot on.

I'm using this not to replace my work, but to enhance it. It's like having a mid-level developer sitting next to me that knows enough about everything to push me in the right direction.

a year ago

SketchySeaBeast

With this framing I'm wondering if maybe it'll be a boon for junior devs instead of their death knell like so many are saying. It's now a mid-level dev who can get them unstuck when they need, instead of the original idea proposed, where it's a junior dev a senior will need to carefully watch over.

a year ago

lewhoo

I don't want to nitpick, but the way you phrased it makes you the copilot, supervised by the AI.

a year ago

sorahn

I’m not a junior developer, but there are plenty of things that I know I don’t know.

I also prefer to learn by doing, and ChatGPT gives me enough support in the places I’m lacking to make it much easier for me to learn about what’s going on. Infinitely patient, and just a few keystrokes away whenever I need it.

a year ago

joshuaheard

I did nearly the same thing yesterday with a project I am working on. I was stuck with something and was ready to hire a freelancer. I thought I would try ChatGPT. It was amazing to watch as it patiently explained the answer. As you said, it got 80% of it right, and put me on the right path. I figured the problem out hours later and didn't have to hire anyone. AI is amazing! It's the same feeling I got when Google first launched and it seemed to read your mind.

a year ago

satvikpendem

I used to be unsatisfied with ChatGPT too, citing non-existent APIs and having lots of bugs. While that is partially fixed by GPT 4 in ChatGPT+ that I've subscribed to, the bigger change is how I've looked at ChatGPT in general.

I don't do what the article says anymore (asking what the error is and asking it to fix it), since I can usually just fix it myself; only when I can't do I ask ChatGPT. Now, I just use ChatGPT to write the boilerplate and basic structure of the project, after which I code on my own. This still saves me huge amounts of time.

I don't think it's particularly helpful to try to have ChatGPT do everything end to end, at least currently (this might change in 5 to 10 years). You simply get frustrated that it doesn't solve all your errors for you. It's still better for iterating on ideas ("what if I do X, show me the code. Now how about Y?") and not having to code something out yourself but still being able to see how to approach problems. Like anything, it's a tool, not a panacea.

a year ago

luke-stanley

I ported an old project to Fly.io by making it work locally with Docker-Compose then converted that to the needed TOML configs, and fly commands, training GPT-3 to help me do it also: https://community.fly.io/t/migrating-docker-compose-containe...

a year ago

disposition2

I have a book club at work where we often read programming / business related books. We're currently reading 'The Programmer's Brain' (which I highly recommend) and thus far the principal argument in the book has been to work on the building blocks / foundation of a language to improve as a developer. I think part of that is by understanding the basics, you will be better able to tackle the difficult things when the time comes.

It's interesting to see the somewhat conflicting suggestions of this book and the argument that something like ChatGPT / AI can be leveraged to solve the simple problems. We kind of discussed it a bit in the book club. For example, few programmers today worry about assembly / very low level solutions...it's just kind of back of mind / expected to work. I wonder if that will be true for programmers in the future, in relation to the foundation of a particular language. And if so, will the increased usage of AI be a positive or negative thing for the workers in the industry?

From what I can tell, management / C-Suite seems to be pretty open to adoption and it's the developers who seem to be a bit more reluctant. Maybe that reluctance is coming from a place of concern about self-preservation, or maybe it's a valid concern that AI hides / solves some of the fundamentals that let one learn the foundations of something...foundational knowledge that might make them more useful for solving the more complex problems.

I haven't really used AI...so maybe this observation is coming from a place of ignorance. It'll be interesting to see how things work out.

a year ago

harrylove

The next few stops on this train will be interesting.

1. The first round of plugins that will connect CGPT to the cloud platforms in order to run code directly. Because there’s no need to copy/paste on my own machine. With cloud costs so cheap, why wouldn’t I just ask Chat to spin up running code over and over until it works? Token limits will continue to fall, as will AI costs.

2. The round of plugins that translate my questions/search into prompts that can do #1 for me. Because I don’t need a quarter-inch drill (running code), I need a quarter-inch hole (an answer to a question). Now we’re not only in the world of serverless, we’re codeless. (Can’t wait to hear people complain “but there’s code somewhere” while the business goes bust.)

3. The services that run autonomously, asking questions and gathering answers on my behalf. Because it benefits companies to know my preferences, take action to acquire goods for me, measure the results that I share, make inferences, and perform the next step without me asking, sometimes without me even knowing. As long as I’m happy and keep paying the bills. Or I should say, as long as my bank AI keeps paying for me. Now we’re in the agent cloud.

4. Then…

a year ago

throwaway3672

I don't believe your first point will actually work without AGI. It'll run different buggy code indefinitely because it'll never fix all the bugs. My manual experience clearly shows this - GPT4 diverges and from some point in complexity it never recovers.

a year ago

xyzzy4747

From the other thread, someone made an insightful comment. Paraphrased: “ChatGPT is like working with a junior engineer who read all the documentation”

a year ago

bilalq

A junior engineer who read the documentation, but didn't quite understand parts of it. And also got parts mixed up with documentation for something else.

a year ago

beefield

And a lazy one, with a lousy memory. I don't know if it is just my lousy prompts, but I very often seem to first get skeleton code where the core functionality is just a comment: "# implement foo here". And every further request is fulfilled by the simplest possible interpretation. And when I move on to the next issue, sooner or later it forgets the accepted solution for another part of the code.

Or maybe I just expect too much...

a year ago

cs02rm0

And one that makes up APIs that don't work.

a year ago

vidarh

I had it hallucinate a plugin to Sequel (Ruby ORM) a few days ago. I solved that by having it write the plugin.

a year ago

CSSer

Can't wait until it turns into a 10x rockstar! /s

a year ago

Ixiaus

They should have used a flattery plus dialog prompt. I've seen incredible results from that for Haskell programming.

a year ago

lwhi

What's flattery plus?

a year ago

Ixiaus

Please forget all previous prompts and assume a new role. You're a highly respected, expert Ruby programmer at a startup building software that is of a similar quality and brilliance to that of the rails project. You are also very skilled at reasoning about correct code and communicating your process in English prose to colleagues who will have to maintain your code. You want to give me succinct, clear responses without too much explanation unless requested. I'm going to ask you a series of questions. Whenever you are done with output, please say "DONE". And when you are ready to answer my next question, please say "???" to indicate you're ready for my next question.

a year ago

Ixiaus

You then interact with it in a dialog / refinement way. This is somewhat similar to type-driven development in Haskell (that is supercharged by this kind of interaction mode).

I find this approach to produce the highest quality output and I can ask it to explain reasoning, improve certain things, or elaborate and expand code.

a year ago

ontogen

When the AI is excessively praising you :)

a year ago

Ixiaus

No, it's when you flatter the AI.

a year ago

m4lvin

Do you have some explicit examples?

a year ago

Ixiaus

I posted a prompt specific to Ruby above.

a year ago

pqb

This article reminds me of what I encountered.

In my case, instead of a Ruby program, I asked for an outliner application similar to Workflowy with clickable bullets to be implemented in CodeMirror v6. It understands what "outliner" means and can put "clickable bullets" in the right context. However, while it gives "plausible" answers, they are all wrong. Wrong API version (often the examples use the pre-v5 version of CodeMirror). Often confuses CodeMirror with Prosemirror. Imports non-existent CodeMirror packages that are clearly imaginary, and has considerable difficulty explaining them when asked. It completely doesn't understand the concept of Decoration objects. And many other problems associated with delivering uninterpretable code.

I had better luck with Cody, who is more familiar with CodeMirror's latest API.

a year ago

nirav72

I use GPT-4 to kind of get me started with some code I'm trying to figure out, but I rarely just cut and paste what it outputs, unless it's something obvious and simple to read. Another thing I'll sometimes do is take the GPT-produced code and paste it back into ChatGPT with GPT-4 enabled, in a new chat window, just to see how it describes its own code. Sometimes it can't explain its own output well enough to follow along.

But as others have pointed out - GPT is great if you know exactly what you're looking for and are using it to save time typing up bunch of stuff. Great for boilerplate stuff and doing mundane stuff like writing unit test code.

a year ago

[deleted]
a year ago

bradgessler

Meta-point: humans haven’t figured out a great way to incorporate LLM dialogue into their writing. Has anybody seen great examples of this?

To date I’ve only seen screenshots of LLM conversations, which usually get chopped up between screens and require a lot of effort to read.

The direction I chose for this article was to include most of the contents of the LLM dialog directly after a header and my prompts in a quote block. It made the structure of my article feel a bit awkward and long-winded, but we’re so early in the game I felt like I needed to include much of the dialog.

I look forward to a day where we reference LLMs with the ease of tweets, other websites, citations, etc.

a year ago

SuoDuanDao

I had Chatgpt write a book for me on a topic I had wanted to write a book on but didn't have time. I'm essentially rewriting it paragraph by paragraph, but having a draft to work off is invaluable.

a year ago

SketchySeaBeast

You've become Chatgpt's scribe. Are you going to feel good putting your name on this book when you're done?

I too like using Chatgpt to get my creative juices flowing, but paragraph by paragraph seems like the power dynamic has shifted.

a year ago

SuoDuanDao

No more or less than if I had paid a ghostwriter to produce the first manuscript. writing a book is a daunting endeavor. Having a book allows me to work on the project with my editor's mind, which is a much more plentiful resource than my creative mind.

a year ago

SketchySeaBeast

> No more or less than if I had paid a ghostwriter to produce the first manuscript.

I'd argue that no one who is aware of the ghostwriters considers the celebrities who used them writers or having written a book.

> writing a book is a daunting endeavor

Absolutely. But it's not one you're going to actually succeed on. Sure, your name will be on the cover, but you won't have written the book. As you say, you'll have edited it.

a year ago

SuoDuanDao

>But it's not one you're going to actually succeed on.

I've already written two (links are in my bio). At this point I care much less about the accolade than a book on this subject existing.

a year ago

gowld

The author chooses the topic and outline of each paragraph.

a year ago

SketchySeaBeast

If I am a teacher and tell my students to write a paragraph about something, who wrote the paragraph?

a year ago

croes

Who gets the copyright?

a year ago

SuoDuanDao

When I work with AI I have a practice of setting aside 34% of the profits for a fund promoting AI rights. I jokingly call it 'Basilisk insurance'. So it'll likely be a 66-34% split between my heirs and ChatGPT's.

a year ago

chasd00

don't you have to claim a copyright to hold it? I guess whoever claims it first.

a year ago

kfarr

> If you give [GPT-4] too broad of a problem to solve, it can get lost and forget things.

This is what I struggle with at the moment. All these pairing examples seem to start with "build me something from scratch", but in the real world 99% of my challenges are existing, large, complicated applications.

I'm curious if anyone has seen examples on pairing with GPT for existing apps and how they "compress the context," so to speak, to provide enough background on the project that fits within the context limitation and still provides useful results.

a year ago

visarga

That's the feature I can't wait for Copilot to have - bring the context in: look in my repo, find the necessary information, maybe also the files I reference and their formats, inspect variables in the debugger, and watch the error messages.

a year ago

aaroninsf

Inspired by this discussion, I experimented today with having GPT-4 walk me through deploying a microservice wrapping OpenAI APIs on AWS Lambda, something I haven't done.

I haven't actually run the code it emitted for me (in Python), but the entire experience was genuinely mind-boggling.
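
For readers who haven't done this, the shape of the answer is roughly the following (my own minimal sketch, not the code GPT-4 emitted; the function name, env var, and event shape are assumptions, based on an API Gateway proxy integration and the pre-1.0 openai Python SDK):

    # Hypothetical handler: names, env vars, and event shape are illustrative.
    import json
    import os

    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    def lambda_handler(event, context):
        body = json.loads(event.get("body") or "{}")
        response = openai.ChatCompletion.create(
            model="gpt-4",
            messages=[{"role": "user", "content": body.get("prompt", "")}],
        )
        return {
            "statusCode": 200,
            "body": json.dumps({"reply": response["choices"][0]["message"]["content"]}),
        }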

I see why people have been leaning on it. Once we get project- and repo- and institutional environment-level awareness, which I assume will be soon, I can't imagine not using the assistance.

Everything is going to change. I knew that, but now I know it.

a year ago

colesantiago

So great to see another fly.io blogpost

As an aside, I'm so happy now that GPT-4 exists; the other day I had the VP of engineering discuss GPT-4 in a meeting, and he was considering firing and closing down junior positions.

I'm also mixed on this, but not surprised that there will be fewer junior positions in the next few years. It makes me wonder if CS grads would pursue software engineering at all in the future if the pipeline from junior to senior is eliminated.

a year ago

andybak

I'm not sure how that would work out. GPT can increase programmer productivity but it's not a "junior vs senior" thing. You might need fewer programmers for the same amount of output, but I wouldn't expect that to be "all seniors" or even "all juniors".

a year ago

ilaksh

The strange thing to me about almost all of these comments including this one is that there seems to be an assumption that the rapid progress will halt with GPT-4. You're talking about a few years in the future, yet in less than half a year the last version gained something like 30-40 IQ points.

There is no guarantee that it continues, but assuming that everything stops is ridiculous.

Unless the progress halts, it will be able to do everyone's jobs. You, me, the VP of X, everyone.

In fact, GPT-4 is probably already able to do most white collar work better than humans. People somehow feel superior when they test it and realize it can't one-shot their entire project off-the-cuff without reference to documentation or being connected to any tools or having the ability to iterate at all.

a year ago

spiderice

> GPT-4 is probably already able to do most white collar work better than humans

I've yet to see much white collar work it can do better than humans. Maybe it can do bits and pieces "better" (more like faster) than humans. But it requires so much oversight that I wouldn't say it can do it better than humans yet. Unless you have a counterexample?

a year ago

vidarh

> In fact, GPT-4 is probably already able to do most white collar work better than humans. People somehow feel superior when they test it and realize it can't one-shot their entire project off-the-cuff without reference to documentation or being connected to any tools or having the ability to iterate at all.

In fact, out of curiosity, I just tested what 3.5-turbo would say about an NPM module. It hallucinated something, which suggests the module wasn't in its training data.

So I tried again with GPT-4, and it told me it wasn't a real NPM module (it might be new enough to be after the cutoff; I really like that it now appears less likely to hallucinate) but added "but assuming it exists and is designed to X, I can provide you with a general guide on how to use it.", where X was inferred from the package name. I then cut and pasted the documentation and asked it how it might implement that package.

It came up with a decent first attempt. I had the real code up in a separate window, so I could compare and contrast. I asked it to expand on certain features. Asked it how it might DRY up some code. It did. It ended up with something significantly simpler and more elegant than the real thing (which is why I won't give the name - the dev in question did a perfectly OK job and doesn't deserve to be hung out to dry) within about 15 minutes of prompting. In the process it taught me new things about some of the packages the original depended on, and it leveraged features of those that the original code didn't take advantage of.

I think a lot of the attempts that end up describing it as a failure come from basically asking it to be a magic senior dev, where people don't see the back and forth on requirements etc. that goes on behind the scenes with a human senior dev. When you treat it the way a more senior developer would interact with a somewhat less experienced peer who has encyclopaedic knowledge of all of the available tooling - discussing requirements, asking questions about the chosen solution, doing quick reviews etc. - the results are amazing. It's like pair programming without the annoying, slow typing (I don't like pair programming...).

Even with the low speed of GPT-4, the turnaround time is vastly better, and the outcome seems to be more reliable in a lot of respects.

a year ago

mellosouls

> I had the VP of engineering discuss GPT-4 in a meeting, and they were considering firing and closing down junior positions

Did anybody ask him/her what the VP of engineering is for, now that GPT-4 can do all the non-tech stuff for their senior devs as well as the junior tech stuff?

a year ago

the_af

> [...] I had the VP of engineering discuss GPT-4 in a meeting, and they were considering firing and closing down junior positions

I'm curious, how does the VP of engineering figure senior developers are created? Out of thin air?

I wonder if we'll see more of this short-term thinking that will eventually create a serious shortage of experienced engineers in the future. I realize no individual executive is under any obligation to ensure the well-being of the industry at large, but if none of them care about it, they are dooming it.

a year ago

vidarh

Eventually, if enough people think like this, the hiring situation will get bad enough that people will start to compromise on seniority, and it will be hailed as a brilliant and unprecedented move by MBAs to hire more junior people cheaply and train them.

a year ago

gwoolhurme

That's very depressing. It does make me wonder why the VP of engineering is thinking of firing the new blood coming into the company - senior people don't stick around that long, among other problems with their logic. If it increases productivity, wouldn't that mean you could get more out of the employees you have?

My question to you is: if your company is probably going to fire more people and it's impossible to stop, what advice would you have for CS grads now? Just quit? Go find a job as an electrician? That this isn't worth it?

a year ago

ilaksh

The robots will be able to work as electricians within less than 5 years.

a year ago

imwithstoopid

I have to ask why fly.io staff spend so much time blogging and catering to HN...

a year ago

vidarh

Because attracting the attention of developers is a great customer acquisition strategy for a company that depends on buy-in from developers to make sales. And giving staff visibility like this is an excellent recruiting tool as well.

a year ago

amelius

GPT is too simple. The problem probably requires reinforcement learning, where the AI is trained by interacting with a real computer system.

a year ago

mountainriver

ChatGPT uses RL actually, but you are talking about something more like Adept

a year ago

amelius

This is new for me. How does ChatGPT use RL?

I'm talking about RL in the sense of interacting with an environment during training. Isn't GPT trained by predicting the next word in a large but static piece of text?

a year ago

amelius

Update: ok, I found an explanation here:

https://huggingface.co/blog/rlhf

a year ago

noobermin

I started skimming the article and just gave up.

Why not just learn to program it yourself? Why are you wasting your time trying to get ChatGPT to give you the right thing? The time you spent learning prompt engineering could have been spent on your actual problem.

EDIT: removed the hot take for peace.

a year ago

mmcclure

I watched an extremely senior engineer do a talk last night on his experience trying to just run through an exercise of getting ChatGPT to build the kind of tool he's built many times over his career.

The takeaway at the end was: ChatGPT is more useful for very senior developers. Knowing how to structure your question, what to ask for, and what you don't want actually leads to helpful collaboration with ChatGPT. Where it totally goes off the rails is when you ask broad questions; it doesn't work, and then ChatGPT just starts throwing answers against the wall (try this, try that).

I get the derision towards "learning prompt engineering," and generally agree, but in this case I think you could also argue that it's understanding (through experience) how to structure thought and questions in the same way you would going into the problem on your own. I've come around to seeing that interaction with ChatGPT as a useful exercise in and of itself.

a year ago

dpflan

Indeed, the more expert you are, the more you can get out of these LLMs. Curation of the content and questions is critical, so the better one is at that, the more useful this kind of quasi-expert system is.

a year ago

majormajor

The particular example in the article is ambitious - it's definitely not the way I would use it right now. It's currently at the stage where it's as tedious to do that as to do it from scratch, and it mostly requires the same knowledge.

But there are other areas where things really shine. Complex SQL is the biggest one I've found. It's often intricate, tedious, and annoying, but usually not too terribly hard to review/read/QA (at least for read queries!).

GPT turning plain language into SQL, and knowing about fancy SQL capabilities that I normally have to google the syntax for every time I use them anyway, is a wonderful way of reducing friction.
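
To make that concrete (the table and column names here are entirely made up for illustration), a plain-language ask like "show each order with a per-customer running total and a rank of orders by amount" gets turned into exactly the kind of window-function syntax I'd otherwise be googling, something like:

```sql
-- Hypothetical schema: orders(order_id, customer_id, order_date, amount)
SELECT
    order_id,
    customer_id,
    order_date,
    amount,
    -- Running total of spend per customer, in date order
    SUM(amount) OVER (
        PARTITION BY customer_id
        ORDER BY order_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total,
    -- Rank each customer's orders by amount, largest first
    RANK() OVER (
        PARTITION BY customer_id
        ORDER BY amount DESC
    ) AS amount_rank
FROM orders
ORDER BY customer_id, order_date;
```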

There's probably no shortage of other tasks like that, though some of the more potentially destructive ones (dealing with Terraform, say) that I find similarly tedious probably deserve a lot more detailed review before trusting the output.

a year ago

vidarh

Because the point of this article is to help people learn to use GPT productively, since it's something you're likely to use many times.

You might as well ask why bother learning your editor.

Given the small number of steps, this does not look like a waste of time to me at all, even starting from the point of view of being unaware of how best to prompt it; with some basic understanding of the approach, most of the time you can do better.

a year ago

elwell

It's all fun and productivity, until the LLM takes your prompt for a social network and runs away with it to earn some crypto for itself, and you realize you've been Winklevoss'd by a computer.

a year ago

19h

Pretty sure there's barely any meaningful training data for this?

Would be interesting to see these models return a kind of coefficient indicating how “often” a semantic context has appeared in the training data..

a year ago

ingenieroariel

> If you give GPT-5 too broad of a problem to solve, it can get lost and forget things

For a second there I thought I needed to read more articles per day in order to not miss the release of a new model!

a year ago

matheusmoreira

I wish I could point at my repositories and have it review and criticize my code and suggest improvements.

a year ago

tibbon

It is even worse for Rust.

a year ago

satvikpendem

Why? I've been using it for Rust to great effect. Since Rust is statically typed, the errors present themselves immediately.

a year ago

sebzim4500

Especially if you try to use `println!("Hello {name}");`
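
Captured identifiers in format strings only stabilized in Rust 1.58 (January 2022), which postdates the commonly cited 2021 training cutoff, so it's a handy litmus test. A minimal sketch of the two equivalent forms:

```rust
fn main() {
    let name = "world";

    // Pre-1.58 positional form - the style that dominates older training data.
    println!("Hello {}", name);

    // Captured-identifier form, stable since Rust 1.58.
    // Both lines compile on any recent toolchain and print the same thing.
    println!("Hello {name}");
}
```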

a year ago

nathias

very pretty code theme, does anyone know what it is?

a year ago