You’re in a coffee shop, pretending to work.
Across the room, a couple is on a date. It’s not going well. You’re not trying to stare, but you catch it anyway, and later you tell a friend: I saw him scrolling on his phone.[TRANS]
Your brain just did something incredible.
Forget the boring textbook name for this grammar ("verbs of perception + object + present participle"). That tells you the rule, but not the feeling.
What you really did was switch your brain into “Live Photo” mode. You didn't report a finished event; you captured a specific, ongoing moment in time.
English in “Live Photo” Mode
Think of verbs like see, hear, feel, and watch as the camera app on your phone. You have two ways to capture a scene.
1. The Regular Photo (Base Verb): This captures a complete action, a finished fact.
I saw the man cross the street.[TRANS]
He started on one side and finished on the other. It's a simple report. The event is over.
2. The Live Photo (-ing Form): This captures the action in progress.
I saw the man crossing the street.[TRANS]
You’re zoomed in on the middle of the action. You caught him partway across, and you may never have seen him reach the other side. You're sharing the vibe.
Reporting a Fact vs. Sharing an Experience
The choice between these forms isn't just about timing—it’s about your role. Are you simply reporting a fact, or are you pulling the listener into a personal, sensory experience?
The -ing form makes your listener feel like they were there with you, seeing what you saw and hearing what you heard. It’s more intimate, personal, and dramatic.
[EXAMPLE_1]
[ENG] I heard my neighbors arguing last night.[TRANS]
[TRANS]
[NOTE] You didn't hear the entire argument from start to finish. You caught a piece of it—the angry sounds vibrating through the wall. You're sharing the unpleasant experience.
[EXAMPLE_2]
[ENG] I felt someone touching my shoulder on the crowded train.[TRANS]
[TRANS]
[NOTE] This captures the creepy, ongoing sensation. The feeling itself is the main event. By contrast, I felt someone touch my shoulder[TRANS] describes just a quick, completed tap. The -ing form emphasizes the unwelcome duration.
The Witness vs. The Director
Ultimately, it's about storytelling. Are you a neutral witness giving a statement, or are you a movie director creating a scene?
The Witness (Base Verb) is objective and detached. It focuses on the result of an action. It’s a summary for the record.
I saw the car speed through the intersection.[TRANS]
This feels like a police report. The event is over. The facts are filed.
The Director (-ing Form) is subjective and immersive. It focuses on the texture of a moment. It puts your audience right inside your head.
I saw the car speeding towards the intersection.[TRANS]
This feels like a live broadcast. You share the rising tension and the danger of the moment as it happens.
This isn't just grammar. It’s the difference between saying “this happened” and saying “I was there, and this is what it felt like.”
The Golden Rule: Use the base verb to report a finished fact. Use -ing to share an unfolding experience. Master this, and you're no longer just speaking English—you're directing the movie of your own life.
From the balcony, we saw the parade marching down the street.[TRANS]
I couldn't sleep because I could hear my neighbors arguing through the walls.[TRANS]
He woke up suddenly, feeling something crawling up his arm.[TRANS]
I love to sit at the café and watch the world going by.[TRANS]
As he gave the speech, I noticed his hands shaking slightly.[TRANS]
The detective observed the suspect pacing back and forth in the interrogation room.[TRANS]
The moment I walked in, I could smell dinner cooking in the kitchen.[TRANS]
She spent the evening listening to her grandfather telling stories about his youth.[TRANS]