neteroster

V2EX member #191331, joined on 2016-09-11 21:01:55 +08:00

neteroster's recent replies

4h 36m ago

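Replied to a topic by xyz8899 › OpenAI › A summary of this round of OpenAI account bans, plus an issue everyone has overlooked (which may be the real cause of the bans)

That's a bit funny. This round was purely mistaken bans; see status.openai.com and the posts on X from the staff responsible.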

19h 1m ago

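Replied to a topic by w568w › OpenAI › Why is my GPT 5.5 different from everyone else's?

@w568w Then I strongly suspect the problem is your harness. Across benchmarks and hands-on feedback alike, GPT is consistently strong at execution. DeepSWE has a case analysis; the original text: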
```
GPT implements exactly what's asked
On DeepSWE, GPT-5.5 has the lowest rate of missing stated behaviors of any configuration in the chart; GPT-5.4 sits just behind it.

GPT reads the prompt and the visible repository contract literally, and produces a patch that honors both. The behavior is consistent across runs: when several GPT trials attempt the same task, they tend to converge on the same interpretation of the prompt, suggesting this precision is a stable trait rather than per-run luck.

A natural follow-up would be to examine whether this precision comes paired with related stylistic traits, like overly defensive code, surplus error handling, or other markers of a tightly instruction-anchored coding style.
```

1 day ago

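Replied to a topic by w568w › OpenAI › Why is my GPT 5.5 different from everyone else's?

@neteroster One more point, less related to engineering code: Opus's world knowledge now seems to be the weakest of the big three, while 5.5's long-tail world knowledge is already within half a step of Gemini's level (even in areas such as ACG QA, where GPT used to be absurdly bad). Combined with frontier-level math and science knowledge and reasoning, that makes it very comfortable for writing research experiment code. I don't know whether other niches see the same thing, but from my own experience in math-adjacent interdisciplinary work, for any code that involves mathematical reasoning I can only trust the GPT series.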

1 day ago

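Replied to a topic by w568w › OpenAI › Why is my GPT 5.5 different from everyone else's?

5.5 is the god of execution and Opus is the god of planning; I don't see where the conflict is.

Opus, whether 4.6, 4.7, or 4.8, just can't execute. I really don't get it: hand it a perfectly explicit spec and the result will still have clear omissions or contradictions, whereas 5.5, and even 5.4, has none of these problems (which is why 5.5 does so well on benchmarks like DeepSWE).

Opus's strengths are preference alignment and discussing approaches. In these subtle areas, 5.x falls flat.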

2 days ago

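Replied to a topic by windsound › Q&A › Why is there a strange phone number showing in the ChatGPT mobile app settings?

Around 2023, registration required SMS verification. Maybe you used an SMS-receiving platform back then and forgot about it.

The number currently can't be changed. If Codex asks for two-factor verification, you won't be able to log in.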

3 days ago

Replied to a topic by wcwcxiaobin › Programmers › Is there anything more accurate and capable than whisper large v3?

There are plenty. For Chinese, Doubao; for multilingual, ElevenLabs or Soniox.

May 31

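Replied to a topic by inostarling › Codex › Codex is really good for writing code, but the usage limits just can't keep up

Fast mode typically raises throughput from about 50 to 70 TPS (an example only; the actual baseline will vary) at 2.5x the price. Whoever wants it can turn it on; I'm not going to.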
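
To make the trade-off in the reply above concrete, here is a rough sketch of the arithmetic using the example numbers from the reply (50 TPS, 70 TPS, 2.5x price). The function name `fast_mode_tradeoff` is made up for illustration, and the sketch assumes the 2.5x multiplier applies uniformly to the cost of the same work, which the reply does not spell out.

```python
def fast_mode_tradeoff(base_tps, fast_tps, price_multiplier):
    """Compare a faster, pricier mode against the baseline.

    Hypothetical helper: assumes the price multiplier applies uniformly
    to the same amount of generated tokens.
    """
    speedup = fast_tps / base_tps                 # how many times faster
    time_saved = 1 - base_tps / fast_tps          # fraction of wall-clock time saved
    cost_per_speedup = price_multiplier / speedup # price paid per unit of speedup
    return speedup, time_saved, cost_per_speedup


# Example figures from the reply; real baselines will differ.
speedup, time_saved, cost_per_speedup = fast_mode_tradeoff(50, 70, 2.5)
print(f"speedup: {speedup:.2f}x")
print(f"time saved: {time_saved:.1%}")
print(f"price multiplier per unit of speedup: {cost_per_speedup:.2f}")
```

Under these example numbers the mode is 1.4x faster for 2.5x the price, so the price grows considerably faster than the throughput.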

May 29

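Replied to a topic by idblife › Codex › How many dollars of tokens is the $200 Codex/Claude Code plan actually worth?

Codex Pro 20x is fairly dynamic and varies by account. Based on user reports, the average weekly limit is roughly $1,500 to $2,000 worth of tokens, which works out to $6,000 to $8,000 per month. If the account gets flagged by risk control, the weekly limit may drop below $1,000.

There are fewer reports for Claude. For 20x, I previously saw someone claim a weekly limit of $3,000+ (after the limit increase), but I can't confirm whether that is accurate.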
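
The weekly-to-monthly conversion for the Codex figures above can be sketched as follows. This is only an illustration: it assumes 4 weeks per month (which is what the 6000-8000 figure implies) and takes the $200 plan price from the topic title; the helper names are hypothetical, and the weekly limits are user-reported estimates rather than official numbers.

```python
WEEKS_PER_MONTH = 4    # assumption: implied by 1500 -> 6000 and 2000 -> 8000
PLAN_PRICE_USD = 200   # plan price taken from the topic title


def monthly_value(weekly_limit_usd, weeks_per_month=WEEKS_PER_MONTH):
    """Token value per month, given a weekly limit expressed in dollars."""
    return weekly_limit_usd * weeks_per_month


def value_multiple(weekly_limit_usd, plan_price_usd=PLAN_PRICE_USD):
    """How many times the plan price the monthly token value represents."""
    return monthly_value(weekly_limit_usd) / plan_price_usd


# Reported weekly limits: a risk-controlled account, then the average range.
for weekly in (1000, 1500, 2000):
    print(f"weekly ${weekly}: monthly ${monthly_value(weekly)}, "
          f"{value_multiple(weekly):.0f}x the plan price")
```

With these assumptions, the reported average range corresponds to roughly 30 to 40 times the subscription price in token value per month.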