Hugging Face
Open to Collab
19
6
24
AbstractPhila
PRO
AbstractPhil
85 followers · 124 following
https://civitai.com/user/AbstractPhila
AbstractEyes
AI & ML interests
datasets, research papers, experimentation, vision, classification, text encoders, tokenization, llms, diffusion, distillation, and more.
Recent Activity
Updated a model 1 day ago: AbstractPhil/geolip-aleph-void
Reacted to OzTianlu's post with 🧠 2 days ago:
ResNet is Explicit Euler. GPT is Implicit Euler. What Else is Hiding in Plain Sight?
Read online: https://datawhalechina.github.io/learning-terrain/
I wrote an open-source monograph on learning dynamics: The Terrain of Learning. Bilingual (Chinese/English), 4 volumes, 12 chapters, 30+ print-grade figures. Completely free (CC BY-NC-SA 4.0).
The core argument: gradient descent is not optimization. It's terrain motion. The loss function is a landscape. The gradient is the direction of slope. The optimizer is how you choose each step. Once you see it this way, everything clicks:
ResNet = explicit Euler integration on a vector field. The residual branch is the vector field. Each layer takes one Euler step.
GPT autoregression = implicit-state Euler iteration. Stable where explicit Euler explodes. That's why transformers handle long-range dependencies.
DEQ = the Banach fixed-point theorem in production. The forward pass is root-finding. There are no layers to backprop through.
KL divergence = a Bregman divergence on the entropy landscape. Your belief space is curved, not flat.
Chain-of-thought reasoning = hidden states flowing along a reasoning field toward an attractor basin. Correct answers have wide basins. The number of reasoning steps is determined by the terrain, not by the problem.
Diffusion models = systems flowing downhill along a score vector field, from noise to structure, from high energy to low energy.
The book traces one idea across 337 years, from F=ma (Newton, 1687) to H=T+V (Hamilton, 1833) to loss landscape + gradient field (2020s). Hamilton replaced a catalog of forces with one geometric object. This book does the same for deep learning.
GitHub: https://github.com/datawhalechina/learning-terrain
Discussion: https://github.com/datawhalechina/learning-terrain/discussions/2
Convergence is not hope. Convergence is geometry. You see.
Replied to OzTianlu's post 2 days ago
Organizations
AbstractPhil's datasets (71)
AbstractPhil/diffusion-pretrain-set-ft1-1024 • Viewer • Updated 5 days ago • 1.14M • 653
AbstractPhil/sdxl-qwen-phase1-cache • Viewer • Updated 9 days ago • 86k • 706
AbstractPhil/geolip-sdxl-fid-scoring • Viewer • Updated 10 days ago • 2.8k • 52
AbstractPhil/sdxl-qwen-phase0 • Viewer • Updated 11 days ago • 86k • 1.1k • 3
AbstractPhil/diffusion-pretrain-set-ft1 • Viewer • Updated 19 days ago • 1.29M • 4.99k • 1
AbstractPhil/IMDB-PUBLIC-SCRAPED • Preview • Updated 27 days ago • 55 • 1
AbstractPhil/ldhnam-deepfashion_controlnet • Viewer • Updated 27 days ago • 26k • 35
AbstractPhil/ffhq_flux_latents_repaired • Viewer • Updated 27 days ago • 40.8k • 596
AbstractPhil/synthetic-characters • Viewer • Updated 27 days ago • 149k • 1.98k
AbstractPhil/CN_pose3D_V10_512 • Viewer • Updated 28 days ago • 66.5k • 344
AbstractPhil/CN_pose3D_V7_512 • Viewer • Updated 28 days ago • 255k • 732
AbstractPhil/synthetic-object-relations-json • Viewer • Updated 28 days ago • 5k • 156
AbstractPhil/cc-task1-json • Preview • Updated 29 days ago • 1.07k
AbstractPhil/cc-prompts-sharded • Updated May 15 • 196
AbstractPhil/json-coco-format • Viewer • Updated May 14 • 129k • 328
AbstractPhil/svae-freckles-4096-cifar10 • Viewer • Updated Apr 10 • 60k • 55
AbstractPhil/ryan-spearman-prepared-features • Viewer • Updated Mar 27 • 1 • 412
AbstractPhil/conceptual-captions-12m-webdataset-berts • Viewer • Updated Mar 20 • 32.3M • 4.45k • 1
AbstractPhil/bertenstein-v1 • Viewer • Updated Mar 7 • 37.4k • 2.51k
AbstractPhil/residual-thinking-embeddings • Updated Mar 3 • 155
AbstractPhil/synthetic-object-relations • Viewer • Updated Jan 28 • 100k • 1.2k • 1
AbstractPhil/imagenet-synthetic • Viewer • Updated Jan 24 • 30k • 2.4k
AbstractPhil/ffhq_with_llava_shorter_captions_flux_latents • Viewer • Updated Jan 23 • 40.8k • 208
AbstractPhil/flux-schnell-teacher-latents • Viewer • Updated Jan 22 • 121k • 145
AbstractPhil/bulk-coco-features • Viewer • Updated Dec 28, 2025 • 4.19M • 487
AbstractPhil/dataset-code-test • Viewer • Updated Dec 28, 2025 • 120 • 47
AbstractPhil/foldl-midi • Preview • Updated Nov 8, 2025 • 13
AbstractPhil/sd15-latent-distillation-500k • Viewer • Updated Nov 7, 2025 • 734k • 108
AbstractPhil/bulk-sd15-feature-extract • Viewer • Updated Oct 26, 2025 • 100 • 16
AbstractPhil/imagenet-clip-features • Viewer • Updated Oct 2, 2025 • 7.16M • 767 • 1